Personal project · Auto-scraping of 4 university hospitals · Hourly batch

Hospital Job Alert Service

2024-08-02 — 2024-08-14

Serverless backend service that automatically collects job postings from 4 university hospitals and delivers keyword-based email notifications


Problem Solving

1

Job seekers had to check 4 hospital career sites manually every day

Solution Process

Strategy pattern per hospital — API (Severance, Ewha, Chung-Ang) + Puppeteer (Chung-Ang Heukseok), hourly auto-scraping

Result

Auto-collected from 4 hospitals + keyword-matched email notifications

2

Risk of missing postings when email notification fails

Solution Process

Notification queue guarded by a pessimistic lock, processed in batches of 10, with up to 3 retries on failure

Result

Deduplication by external ID + reliable notification delivery
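
One way deduplication by external ID could work is sketched below. It assumes each posting carries the hospital name and the ID the hospital site assigns to it. The `ScrapedPosting` type and the `postingKey` and `filterNewPostings` helpers are hypothetical names, and an in-memory `Set` stands in for what would more plausibly be a unique constraint in PostgreSQL.

```typescript
// Hypothetical posting shape; the field names are assumptions.
interface ScrapedPosting {
  hospital: string;
  externalId: string; // ID assigned by the hospital's own site
  title: string;
}

// Composite key: two hospitals may reuse the same external ID.
function postingKey(p: ScrapedPosting): string {
  return `${p.hospital}:${p.externalId}`;
}

// Returns only postings that have not been seen before and records them
// in `seen`, so a later hourly run skips them.
function filterNewPostings(
  seen: Set<string>,
  scraped: ScrapedPosting[],
): ScrapedPosting[] {
  const fresh: ScrapedPosting[] = [];
  for (const p of scraped) {
    const key = postingKey(p);
    if (seen.has(key)) continue; // already stored by an earlier run
    seen.add(key);
    fresh.push(p);
  }
  return fresh;
}
```

Only the postings returned by `filterNewPostings` would go on to keyword matching, so a posting that stays online for weeks triggers at most one notification per subscriber.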

Project Description

Built to spare medical job seekers the hassle of manually checking multiple hospital career sites. The service scrapes job postings from 4 university hospitals (Severance, Ewha, Chung-Ang, etc.) every hour and sends an email notification when a new posting matches a subscriber's registered keywords. The Strategy pattern handles the differing site structures by separating API-based scraping from Puppeteer browser automation on a per-hospital basis. Pessimistic locking and retry logic on the notification queue keep email delivery reliable.
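
The per-hospital Strategy split can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the `ScrapeStrategy` interface, the `JobPosting` shape, and both class names are hypothetical, and the real HTTP and Puppeteer calls are replaced by injected functions so the example runs without network access.

```typescript
// Hypothetical shape of a scraped posting; the field names are assumptions.
interface JobPosting {
  hospital: string;
  externalId: string; // ID the hospital site assigns to the posting
  title: string;
}

// One strategy per hospital: each knows how to collect its own postings.
interface ScrapeStrategy {
  readonly hospital: string;
  scrape(): Promise<JobPosting[]>;
}

type ApiRow = { id: string; title: string };
type PageLink = { href: string; text: string };

// For hospitals that expose a JSON API. The fetcher is injected so the real
// HTTP call can be replaced by a stub.
class ApiScrapeStrategy implements ScrapeStrategy {
  readonly hospital: string;
  private readonly fetchRows: () => Promise<ApiRow[]>;

  constructor(hospital: string, fetchRows: () => Promise<ApiRow[]>) {
    this.hospital = hospital;
    this.fetchRows = fetchRows;
  }

  async scrape(): Promise<JobPosting[]> {
    const rows = await this.fetchRows();
    return rows.map((r) => ({ hospital: this.hospital, externalId: r.id, title: r.title }));
  }
}

// For sites that need a rendered page. In the real service Puppeteer would
// drive a browser here; this sketch injects the extracted links instead.
class BrowserScrapeStrategy implements ScrapeStrategy {
  readonly hospital: string;
  private readonly extractLinks: () => Promise<PageLink[]>;

  constructor(hospital: string, extractLinks: () => Promise<PageLink[]>) {
    this.hospital = hospital;
    this.extractLinks = extractLinks;
  }

  async scrape(): Promise<JobPosting[]> {
    const links = await this.extractLinks();
    return links.map((l) => ({
      hospital: this.hospital,
      // Assumption: the last path segment of the link identifies the posting.
      externalId: l.href.split("/").pop() ?? l.href,
      title: l.text.trim(),
    }));
  }
}

// The caller depends only on the interface, so adding a hospital means adding
// a strategy to the list rather than editing this function.
async function scrapeAll(strategies: ScrapeStrategy[]): Promise<JobPosting[]> {
  const collected: JobPosting[] = [];
  for (const strategy of strategies) {
    try {
      collected.push(...(await strategy.scrape()));
    } catch {
      // Assumption: a failing hospital is skipped so the others still run.
    }
  }
  return collected;
}
```

The hourly job would then call `scrapeAll` with one strategy instance per hospital. Whether a failing site should be skipped silently, as here, or logged and retried is a design choice this sketch leaves open.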

Highlights

  • Hourly automated collection from 4 hospitals with Strategy pattern
  • Pessimistic locking on notification queue preventing duplicate sends
  • Dual deployment: Docker on EC2 + Serverless Framework on Lambda
  • GitHub Actions CI/CD with automated post-deploy health checks

Performance Metrics

Metric | Before | After
Scraping targets | Manual checking | 4 hospitals, hourly automatic collection (automation)
Notification reliability | Single sends | Batches of 10 + 3 retries (pessimistic lock)

Tech Decisions

  • Strategy pattern: handles the differing hospital site structures by separating API-based scraping (Severance, Ewha, Chung-Ang) from Puppeteer automation (Chung-Ang Heukseok)
  • Pessimistic lock on the notification queue: prevents duplicate sends when batch runs overlap; batches of 10 with up to 3 retries on failure keep email delivery reliable
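
The batch-and-retry behaviour of the notification queue can be sketched as below. It is an in-memory illustration under stated assumptions: the `QueuedNotification` shape, the status values, and the function names are hypothetical, and "up to 3 retries" is read here as one initial attempt plus at most three retries. The pessimistic lock itself is not reproduced; the comment on `claimBatch` marks where the real service would hold a row lock inside a database transaction.

```typescript
type Status = "pending" | "processing" | "sent" | "failed";

// Hypothetical queue row; the field names are assumptions.
interface QueuedNotification {
  id: number;
  email: string;
  status: Status;
  attempts: number;
}

const BATCH_SIZE = 10; // batch size stated in the text
const MAX_RETRIES = 3; // retries allowed after the first attempt

// Claims up to BATCH_SIZE pending rows. With a database this step would run
// in a transaction that takes a pessimistic write lock on the selected rows,
// so two overlapping batch runs cannot claim the same notification.
function claimBatch(queue: QueuedNotification[]): QueuedNotification[] {
  const batch = queue.filter((n) => n.status === "pending").slice(0, BATCH_SIZE);
  for (const n of batch) n.status = "processing";
  return batch;
}

// Sends one batch and returns how many rows were claimed. A failed send goes
// back to "pending" for a later run until its retry budget is used up.
async function processBatch(
  queue: QueuedNotification[],
  send: (n: QueuedNotification) => Promise<void>,
): Promise<number> {
  const batch = claimBatch(queue);
  for (const n of batch) {
    n.attempts += 1;
    try {
      await send(n);
      n.status = "sent";
    } catch {
      n.status = n.attempts > MAX_RETRIES ? "failed" : "pending";
    }
  }
  return batch.length;
}
```

Because a row leaves the "pending" state as soon as it is claimed, a second run that starts while the first is still sending would not pick it up again, which is the duplicate-send case the lock is meant to prevent.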

Lessons Learned

  • Made the scraping logic easier to extend and maintain by applying the Strategy pattern to the hospitals' differing site structures
  • Designed an extensible architecture in which a new hospital can be added without modifying existing code, because the scraping algorithm is selected at runtime
  • Gained experience with a dual deployment architecture that combines a persistent EC2 server with serverless Lambda for cost efficiency and reliability
  • Improved deployment reliability with a GitHub Actions CI/CD pipeline that runs automated post-deploy health checks
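
A post-deploy health check of the kind mentioned above can be as simple as polling an endpoint until it answers successfully. The sketch below is an assumption about how such a check might look, not the project's actual workflow step: the `waitUntilHealthy` name, the attempt count, and the delay are hypothetical, and the probe is injected so the logic can be exercised without a live server.

```typescript
// A probe resolves to true when the service reports healthy.
type Probe = () => Promise<boolean>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls the probe until it succeeds or the attempts run out.
// Returns true if the service became healthy, false otherwise.
async function waitUntilHealthy(
  probe: Probe,
  maxAttempts = 5, // hypothetical default
  delayMs = 2000, // hypothetical default
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      if (await probe()) return true;
    } catch {
      // A thrown error (for example a refused connection) counts as
      // "not healthy yet" and leads to another attempt.
    }
    if (attempt < maxAttempts) await sleep(delayMs);
  }
  return false;
}
```

In a CI step the probe might wrap a `fetch` call against the deployed service's health URL and check `response.ok`, with the script exiting non-zero when `waitUntilHealthy` returns false so that the pipeline marks the deploy as failed.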

Tech Stack

NestJS · TypeScript · PostgreSQL · TypeORM · Puppeteer · Cheerio · Nodemailer · Docker · AWS EC2 · AWS RDS · AWS Lambda · Serverless Framework
