Forge OCR Server v0.1.0
Multi-layer document OCR with ROI cropping, consensus and validation
API Endpoints
| Method | Path | Description |
| --- | --- | --- |
| POST | /api/v1/ocr/{doc_type} | OCR any document (multipart) |
| POST | /api/v1/ocr/{doc_type}/json | OCR any document (base64 JSON) |
| POST | /api/v1/pan/ocr | PAN card OCR (multipart) |
| POST | /api/v1/pan/ocr/json | PAN card OCR (base64 JSON) |
| GET | /api/v1/documents | List document types |
| GET | /api/v1/layers | Available OCR layers |
| GET | /api/v1/health | Health check |
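The base64 JSON endpoints accept the image inside a JSON body rather than as a multipart upload. The body schema is not documented on this page, so the sketch below assumes a single hypothetical key, `image`, holding the base64 text; check the server's actual schema before relying on it. The host is the one used in the Quick Start examples, and `build_json_request` is an illustrative helper name, not part of the API.

```python
import base64
import json
import urllib.request

BASE_URL = "https://ocr.setulab.com"  # host taken from the Quick Start examples


def build_json_request(doc_type: str, image_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/ocr/{doc_type}/json."""
    # ASSUMPTION: the server expects the base64 image under the key "image".
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/ocr/{doc_type}/json",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Reads a local file and sends the request; requires network access.
    with open("pan_card.jpg", "rb") as fh:
        request = build_json_request("pan_card", fh.read())
    with urllib.request.urlopen(request) as response:
        print(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the encoding step easy to inspect or unit-test without touching the network.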
Supported Documents
Aadhaar Card (UIDAI, India)
doc_type: aadhaar_card
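| Field | Label | Required |
| --- | --- | --- |
| aadhaar_number | Aadhaar Number | required |
| name | Name | required |
| date_of_birth | Date of Birth | required |
| gender | Gender | required |
| address | Address | optional |
| masked_number | Masked Aadhaar Number | optional |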
PAN Card (India)
doc_type: pan_card
| Field | Label | Required |
| --- | --- | --- |
| name | Name | required |
| fathers_name | Father's Name | required |
| date_of_birth | Date of Birth | required |
| pan_number | PAN Number | required |
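The `pan_number` and `aadhaar_number` fields have publicly known formats that a client can check before or after calling the API. The sketch below assumes the commonly documented rules: a PAN is five uppercase letters, four digits and one uppercase letter, and an Aadhaar number is 12 digits whose last digit is a Verhoeff check digit. The server's own validation rules are not described here and may be stricter; the function names are illustrative.

```python
import re

PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")
AADHAAR_PATTERN = re.compile(r"[0-9]{12}")

# Verhoeff multiplication table (dihedral group D5).
_D = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
    [2, 3, 4, 0, 1, 7, 8, 9, 5, 6],
    [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
    [4, 0, 1, 2, 3, 9, 5, 6, 7, 8],
    [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
    [6, 5, 9, 8, 7, 1, 0, 4, 3, 2],
    [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
    [8, 7, 6, 5, 9, 3, 2, 1, 0, 4],
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
]

# Verhoeff permutation table: row i is the base permutation applied i times.
_BASE_PERMUTATION = [1, 5, 7, 6, 2, 8, 3, 0, 9, 4]
_P = [list(range(10))]
for _ in range(7):
    _P.append([_P[-1][x] for x in _BASE_PERMUTATION])


def verhoeff_valid(digits: str) -> bool:
    """Return True if the digit string (check digit last) passes Verhoeff."""
    checksum = 0
    for position, char in enumerate(reversed(digits)):
        checksum = _D[checksum][_P[position % 8][int(char)]]
    return checksum == 0


def is_valid_pan(value: str) -> bool:
    """Format check only: AAAAA9999A."""
    return PAN_PATTERN.fullmatch(value) is not None


def is_valid_aadhaar(value: str) -> bool:
    """12 digits (spaces ignored) with a valid Verhoeff check digit."""
    digits = value.replace(" ", "")
    return AADHAAR_PATTERN.fullmatch(digits) is not None and verhoeff_valid(digits)
```

A format check like `is_valid_pan` only catches malformed strings; it cannot tell whether the PAN was actually issued.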
OCR Engines
| ID | Description | Weight | Status |
| --- | --- | --- | --- |
| llm_workers_ai | Cloudflare Workers AI Vision OCR | 0.95 | active |
| ocrs_deep_learning | OCRS deep learning OCR (RTen inference) | 0.50 | active |
| tesseract | Tesseract LSTM OCR engine | 0.70 | active |
| onnx_ocr | PaddleOCR v4 English recognition (ONNX Runtime) | 0.60 | active |
Pipeline
- Upload image (multipart or base64 JSON)
- Per-field ROI cropping from document layout
- Multiple OCR engines process each field region in parallel
- Results merged via weighted consensus per field
- Each field validated (format, checksum, cross-field checks)
- Response: extracted fields, confidence score, review recommendation
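The weighted-consensus step can be pictured as a weighted vote per field. The sketch below is one plausible scheme, not the server's documented algorithm: each engine's reading of a field contributes its weight from the OCR Engines table, the reading with the highest total wins, and the confidence is the winner's share of the total weight. The function name and the confidence formula are assumptions.

```python
from collections import defaultdict

# Weights copied from the OCR Engines table.
ENGINE_WEIGHTS = {
    "llm_workers_ai": 0.95,
    "tesseract": 0.70,
    "onnx_ocr": 0.60,
    "ocrs_deep_learning": 0.50,
}


def weighted_consensus(readings: dict[str, str]) -> tuple[str, float]:
    """Merge per-engine readings of one field into (value, confidence).

    ASSUMPTION: exact string matches (after trimming whitespace) count as
    agreement; the real service may normalise or fuzzy-match readings.
    """
    if not readings:
        raise ValueError("at least one engine reading is required")
    scores: dict[str, float] = defaultdict(float)
    for engine, text in readings.items():
        scores[text.strip()] += ENGINE_WEIGHTS[engine]
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())


if __name__ == "__main__":
    value, confidence = weighted_consensus({
        "llm_workers_ai": "ABCDE1234F",
        "tesseract": "ABCDE1234F",
        "onnx_ocr": "ABCDE1234P",
        "ocrs_deep_learning": "ABCDE1234F",
    })
    print(value, round(confidence, 3))
```

Under this scheme two lower-weight engines that agree (for example tesseract and onnx_ocr, 1.30 combined) outvote a single higher-weight engine (llm_workers_ai, 0.95), which is the usual motivation for combining several engines rather than trusting the strongest one.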
Quick Start
curl https://ocr.setulab.com/api/v1/health
curl -X POST https://ocr.setulab.com/api/v1/pan/ocr \
-F "image=@pan_card.jpg"
curl -X POST https://ocr.setulab.com/api/v1/ocr/aadhaar_card \
-F "image=@aadhaar_card.jpg"
curl -X POST https://ocr.setulab.com/api/v1/pan/ocr \
-F "image=@pan_card.jpg" \
-F "include_layers=true"
curl -X POST https://ocr.setulab.com/api/v1/pan/ocr \
-F "image=@pan_card.jpg" \
-F "layers=tesseract,ocrs_deep_learning" \
-F "include_layers=true"