Overcoming Synchronous Parsing Issues with Docling
Rohan Shrestha
December 06, 2025
Recently, I integrated Docling, a solid framework for document parsing and chunking, into a production system. With the introduction of its API server, docling-serve, integrating Docling has become easier than ever. However, while working on this, I ran into a few challenges with synchronous document parsing that called for a more robust solution.
The Problem with Synchronous Parsing
When parsing documents synchronously, two main issues arose:
- Long Processing Times: Parsing can take several minutes (in some cases, over 5 minutes depending on the document size and complexity).
- Connection Timeouts: Keeping an HTTP connection open for that long is not ideal. Proxies in front of the service enforce their own limits. Cloudflare, for example, returns a 524 timeout by default if the origin has not responded within about 100 seconds, which leads to dropped requests and failures.
The Solution: Asynchronous Workflow
To handle this, I implemented an asynchronous document processing workflow. There were two potential approaches:
- Webhooks: Expose a webhook endpoint to receive the result once processing completes.
- Polling: Implement a polling mechanism to check for completion and retrieve results once ready.
For simplicity and ease of integration without exposing new public endpoints, I chose the polling approach.
The Workflow
Here is the step-by-step workflow I implemented:
- Submit Document for Processing
  - Send the document URL via an async POST request to the Docling service to initiate background processing.
- Poll for Completion
  - The service returns a task_id.
  - Use a polling mechanism with exponential backoff to check the task’s status periodically.
  - Backoff configuration:
    - Initial interval: 2 seconds
    - Multiplier: ×3
    - Max interval: 60 seconds
    - Timeout: 10 minutes
- Retrieve Results
  - Once the status returns success, perform a final GET request to fetch the processed data.
- Handle Results and Errors
  - On success: Store and handle the processed document.
  - On failure or timeout: Capture and log the error for further action.
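The submit, retrieve, and error-handling steps above can be sketched roughly as follows. This is a minimal illustration, not the production code: the endpoint paths, the request payload, and the names processDocument, HttpFn, and waitForTask are assumptions made for the example, so check them against the docling-serve version you run. The HTTP client is passed in (for instance, the global fetch) to keep the sketch self-contained, and the polling step is delegated to a caller-supplied function.

```typescript
// Minimal HTTP shape so the sketch does not depend on DOM or Node typings.
type HttpResponse = { ok: boolean; status: number; json(): Promise<unknown> };
type HttpFn = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<HttpResponse>;

type ProcessResult =
  | { ok: true; document: unknown }
  | { ok: false; error: string };

// ASSUMPTION: illustrative paths; verify against your docling-serve deployment.
const SUBMIT_PATH = "/v1/convert/source/async";
const resultPath = (taskId: string): string => `/v1/result/${taskId}`;

async function processDocument(
  http: HttpFn,
  baseUrl: string,
  documentUrl: string,
  // Polling is delegated: resolves with the terminal status, rejects on timeout.
  waitForTask: (taskId: string) => Promise<"success" | "failure">,
): Promise<ProcessResult> {
  try {
    // Submit the document URL; the service answers with a task id.
    // ASSUMPTION: the payload shape below is illustrative.
    const submit = await http(`${baseUrl}${SUBMIT_PATH}`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ sources: [{ kind: "http", url: documentUrl }] }),
    });
    if (!submit.ok) {
      return { ok: false, error: `submit failed: HTTP ${submit.status}` };
    }
    const { task_id } = (await submit.json()) as { task_id: string };

    // Wait until the task reaches a terminal state.
    const status = await waitForTask(task_id);
    if (status !== "success") {
      return { ok: false, error: `task ${task_id} ended with status ${status}` };
    }

    // Fetch the processed data with a final GET.
    const result = await http(`${baseUrl}${resultPath(task_id)}`);
    if (!result.ok) {
      return { ok: false, error: `result fetch failed: HTTP ${result.status}` };
    }
    return { ok: true, document: await result.json() };
  } catch (err) {
    // Timeouts and network errors end up here so the caller can log them.
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Returning a tagged result instead of throwing keeps the success and failure branches explicit at the call site, which makes the "store on success, log on failure" step hard to forget.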
Implementation Example
Below is a simplified version of the TypeScript implementation used to handle the polling logic:
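As a minimal sketch of that polling logic (an illustration under assumptions, not the production code): pollUntilDone, BackoffOptions, and the checkStatus callback are hypothetical names, and the non-terminal status values "pending" and "started" are assumed; only "success" is taken from the workflow above. checkStatus would wrap a GET to the service's status endpoint for the task_id.

```typescript
// ASSUMPTION: "pending", "started", and "failure" are illustrative status names.
type TaskStatus = "pending" | "started" | "success" | "failure";

interface BackoffOptions {
  initialMs: number; // first wait
  multiplier: number; // growth factor applied after each check
  maxMs: number; // cap on a single wait
  timeoutMs: number; // overall budget for waiting
}

// Values from the backoff configuration: 2 s, x3, 60 s cap, 10 min timeout.
const DEFAULT_BACKOFF: BackoffOptions = {
  initialMs: 2000,
  multiplier: 3,
  maxMs: 60000,
  timeoutMs: 10 * 60000,
};

async function pollUntilDone(
  checkStatus: () => Promise<TaskStatus>,
  opts: BackoffOptions = DEFAULT_BACKOFF,
  // Injectable so the loop can be tested without real delays.
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<"success" | "failure"> {
  let waited = 0; // counts time spent sleeping, not time spent in requests
  let interval = opts.initialMs;

  while (waited < opts.timeoutMs) {
    // Never sleep past the overall budget.
    const wait = Math.min(interval, opts.timeoutMs - waited);
    await sleep(wait);
    waited += wait;

    const status = await checkStatus();
    if (status === "success" || status === "failure") {
      return status;
    }

    // Grow the interval, but keep it under the cap.
    interval = Math.min(interval * opts.multiplier, opts.maxMs);
  }

  throw new Error(`Polling timed out after ${opts.timeoutMs} ms`);
}
```

The budget here tracks only the time spent sleeping. If status requests themselves can be slow, comparing against a wall-clock deadline would be the stricter choice.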
Example Polling Intervals
To avoid hammering the server while ensuring timely updates, the exponential backoff looks something like this:
- Attempt 1: Wait 2s
- Attempt 2: Wait 6s
- Attempt 3: Wait 18s
- Attempt 4: Wait 54s
- Attempt 5+: Wait 60s (capped)
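The intervals above follow directly from the configuration (start at 2 s, multiply by 3, cap at 60 s). A small sketch makes the arithmetic explicit; backoffSchedule and cumulative are hypothetical helper names used only for illustration.

```typescript
// Wait (in seconds) before each of the first `attempts` status checks.
function backoffSchedule(
  attempts: number,
  initialS = 2,
  multiplier = 3,
  maxS = 60,
): number[] {
  const waits: number[] = [];
  let wait = initialS;
  for (let i = 0; i < attempts; i++) {
    waits.push(Math.min(wait, maxS));
    wait = Math.min(wait * multiplier, maxS); // grow, then clamp to the cap
  }
  return waits;
}

// Running total of time spent waiting after each check.
function cumulative(waits: number[]): number[] {
  let total = 0;
  return waits.map((w) => (total += w));
}

const schedule = backoffSchedule(6); // [2, 6, 18, 54, 60, 60]
const elapsed = cumulative(schedule); // [2, 8, 26, 80, 140, 200]
```

After the first four checks about 80 seconds have passed, and each further check adds 60 seconds, so the 10-minute timeout leaves room for roughly a dozen checks in total.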
Conclusion
By moving from a synchronous request to an asynchronous polling model, I sidestepped the timeout limits of intermediary proxies like Cloudflare and ended up with a more resilient document processing pipeline.
- Docling: https://www.docling.ai/
- docling-serve: https://github.com/DS4SD/docling-serve