Hospital Job Alert Service
2024-08-02 — 2024-08-14
Serverless backend service that automatically collects job postings from 4 university hospitals and delivers keyword-based email notifications
Problem Solving
Job seekers manually checking 4 hospital career sites daily
Strategy pattern per hospital — API (Severance, Ewha, Chung-Ang) + Puppeteer (Chung-Ang Heukseok), hourly auto-scraping
Auto-collected from 4 hospitals + keyword-matched email notifications
Risk of missing postings when email notification fails
Pessimistic lock notification queue, batch processing (10), up to 3 retries on failure
Deduplication by external ID + reliable notification delivery through locking and retries
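The dedup and keyword-matching steps above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: the `JobPosting` and `Subscriber` shapes, the `hospital:externalId` key format, and the case-insensitive title match are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real schema is not shown in this document.
interface JobPosting {
  externalId: string; // ID assigned by the hospital site
  hospital: string;
  title: string;
}

interface Subscriber {
  email: string;
  keywords: string[];
}

// Assumption: external IDs are only unique within one hospital,
// so the dedup key combines hospital name and external ID.
function postingKey(p: JobPosting): string {
  return `${p.hospital}:${p.externalId}`;
}

// Returns postings not seen before and records their keys in `seen`.
function filterNew(scraped: JobPosting[], seen: Set<string>): JobPosting[] {
  const fresh: JobPosting[] = [];
  for (const p of scraped) {
    const key = postingKey(p);
    if (seen.has(key)) continue; // already stored or repeated in this run
    seen.add(key);
    fresh.push(p);
  }
  return fresh;
}

// Returns the emails of subscribers with at least one keyword in the title.
// Assumption: matching is a case-insensitive substring test; blank keywords are ignored.
function matchSubscribers(posting: JobPosting, subscribers: Subscriber[]): string[] {
  const title = posting.title.toLowerCase();
  return subscribers
    .filter((s) =>
      s.keywords.some((k) => k.trim() !== "" && title.includes(k.trim().toLowerCase())),
    )
    .map((s) => s.email);
}
```

In a deployed service the `seen` set would more plausibly be a unique constraint on the postings table, so that repeated hourly runs cannot insert the same posting twice.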
Project Description
Built to spare medical job seekers the hassle of manually checking multiple hospital career sites. The service scrapes job postings from 4 university hospitals (Severance, Ewha, Chung-Ang, and Chung-Ang Heukseok) every hour and emails subscribers when a new posting matches one of their registered keywords. Because each hospital site is structured differently, scraping follows the Strategy pattern: each hospital gets its own strategy, either API-based scraping or Puppeteer browser automation. Pessimistic locking and retry logic on the notification queue keep email delivery reliable.
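A minimal sketch of how the per-hospital Strategy pattern might look. All names here (`ScrapeStrategy`, `ApiScrapeStrategy`, `BrowserScrapeStrategy`, `collectAll`) are hypothetical, and the HTTP and browser calls are injected as plain functions so the example runs without network access. In the real service, `renderHtml` would presumably wrap Puppeteer.

```typescript
// Hypothetical posting shape; the real schema is not shown in this document.
interface JobPosting {
  externalId: string;
  hospital: string;
  title: string;
}

// The common interface every hospital-specific scraper implements.
interface ScrapeStrategy {
  readonly hospital: string;
  scrape(): Promise<JobPosting[]>;
}

// Injected transports (assumptions): one for JSON APIs, one for rendered pages.
type FetchJson = (url: string) => Promise<unknown>;
type RenderHtml = (url: string) => Promise<string>;

// Strategy for hospitals that expose a JSON endpoint.
class ApiScrapeStrategy implements ScrapeStrategy {
  constructor(
    readonly hospital: string,
    private readonly url: string,
    private readonly fetchJson: FetchJson,
    private readonly parse: (raw: unknown, hospital: string) => JobPosting[],
  ) {}

  async scrape(): Promise<JobPosting[]> {
    return this.parse(await this.fetchJson(this.url), this.hospital);
  }
}

// Strategy for sites that need a real browser to render the listing.
class BrowserScrapeStrategy implements ScrapeStrategy {
  constructor(
    readonly hospital: string,
    private readonly url: string,
    private readonly renderHtml: RenderHtml,
    private readonly parse: (html: string, hospital: string) => JobPosting[],
  ) {}

  async scrape(): Promise<JobPosting[]> {
    return this.parse(await this.renderHtml(this.url), this.hospital);
  }
}

// The hourly job depends only on the interface. A failure at one hospital
// is recorded and does not stop collection from the others.
async function collectAll(
  strategies: ScrapeStrategy[],
): Promise<{ postings: JobPosting[]; failed: string[] }> {
  const postings: JobPosting[] = [];
  const failed: string[] = [];
  for (const strategy of strategies) {
    try {
      postings.push(...(await strategy.scrape()));
    } catch {
      failed.push(strategy.hospital);
    }
  }
  return { postings, failed };
}
```

Under this layout, adding a fifth hospital means constructing one more strategy and appending it to the list passed to `collectAll`; the collector itself stays unchanged.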
Highlights
- Hourly automated collection from 4 hospitals with Strategy pattern
- Pessimistic locking on notification queue preventing duplicate sends
- Dual deployment: Docker on EC2 + Serverless Framework on Lambda
- GitHub Actions CI/CD with automated post-deploy health checks
Performance Metrics
| Metric | Before | After |
|---|---|---|
| Scraping targets | Manual checking | 4 hospitals, automated hourly collection |
| Notification reliability | Sent one at a time | Batches of 10 + up to 3 retries (pessimistic lock) |
Tech Decisions
- Strategy pattern: handles varying hospital site structures by separating API-based scraping (Severance, Ewha, Chung-Ang) from Puppeteer automation (Chung-Ang Heukseok)
- Pessimistic lock on the notification queue: prevents duplicate sends when batches run concurrently; a batch size of 10 and up to 3 retries on failure keep email delivery reliable
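The queue behavior described above could be sketched as follows. This is an illustration under stated assumptions, not the service's implementation: the document does not name the database or lock primitive, so `InMemoryQueue` stands in for a table whose rows would typically be claimed inside a transaction with a row lock such as `SELECT ... FOR UPDATE`. The status names, `processBatch`, and the reading of "up to 3 retries" as three retries after the first attempt are also assumptions.

```typescript
type Status = "PENDING" | "PROCESSING" | "SENT" | "FAILED";

// Hypothetical queue row.
interface QueuedNotification {
  id: number;
  email: string;
  status: Status;
  retries: number;
}

const BATCH_SIZE = 10; // from the document
const MAX_RETRIES = 3; // from the document; interpreted as retries after the first attempt

// In-memory stand-in for the locked claim. The filter-and-mark step contains
// no await, so it runs atomically here, which mimics what a row lock provides:
// two concurrent workers can never claim the same row.
class InMemoryQueue {
  constructor(private readonly rows: QueuedNotification[]) {}

  async claimBatch(limit: number): Promise<QueuedNotification[]> {
    const claimed = this.rows.filter((r) => r.status === "PENDING").slice(0, limit);
    for (const row of claimed) row.status = "PROCESSING";
    return claimed;
  }
}

type SendEmail = (n: QueuedNotification) => Promise<void>;

// Claims one batch, sends each notification, and requeues failures
// until the retry budget is used up.
async function processBatch(
  queue: InMemoryQueue,
  send: SendEmail,
): Promise<{ sent: number; requeued: number; failed: number }> {
  const batch = await queue.claimBatch(BATCH_SIZE);
  let sent = 0;
  let requeued = 0;
  let failed = 0;
  for (const n of batch) {
    try {
      await send(n);
      n.status = "SENT";
      sent++;
    } catch {
      if (n.retries < MAX_RETRIES) {
        n.retries++;
        n.status = "PENDING"; // picked up again by a later batch
        requeued++;
      } else {
        n.status = "FAILED"; // retry budget exhausted
        failed++;
      }
    }
  }
  return { sent, requeued, failed };
}
```

Marking claimed rows as `PROCESSING` before any email goes out is what keeps a second, concurrently started batch from sending the same notification twice.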
Lessons Learned
- Made the scraping logic extensible and maintainable by applying the Strategy pattern to the varying hospital site structures
- Designed the architecture so that a new hospital can be added as a new strategy, selected at runtime, without modifying existing code
- Gained experience with a dual deployment architecture that combines a persistent EC2 server with serverless Lambda for cost efficiency and reliability
- Improved deployment reliability with a GitHub Actions CI/CD pipeline that runs automated health checks after each deploy