Async Workflow Details
Asynchronous conversion in PDFBolt allows you to process multiple documents simultaneously without waiting for each one to complete before starting the next. This method is particularly beneficial when dealing with large volumes of documents, as it significantly improves efficiency and throughput.
How the Asynchronous Flow Works
When you send an asynchronous conversion request to PDFBolt's API, the process unfolds as follows:
1. Submit Request
You send a POST request to the /async endpoint with the required parameters, including a webhook URL where you want to receive the callback once processing is complete.
2. Immediate Acknowledgment
The API immediately responds with a requestId, acknowledging receipt of your request. This allows your application to continue executing without waiting for the PDF generation to finish.
3. Parallel Processing
PDFBolt processes the documents in the background. Multiple documents can be processed in parallel, maximizing efficiency and reducing overall processing time.
4. Direct S3 Upload (Optional)
If you've provided a customS3PresignedUrl, the generated PDF will be uploaded directly to your own S3-compatible bucket, enhancing security and compliance.
5. Webhook Notification
Once the PDF is generated, PDFBolt sends a POST request to your specified webhook URL, containing information such as status, documentUrl, and more. For a full list of parameters, refer to the Webhook Request Parameters.
6. Document Retrieval
You can retrieve the PDF from the provided documentUrl.
If the PDF was uploaded to your S3 bucket, you'll have immediate access within your own storage environment.
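Steps 1 and 2 can be sketched in Node.js as follows. This is a minimal sketch under stated assumptions, not PDFBolt's reference client: the base URL https://api.example.com, the API-KEY header name, and the helper names buildAsyncRequest and submitAsync are placeholders chosen for illustration. Only the /async path, the url, webhook, and customS3PresignedUrl parameters, and the requestId response field come from this page.

```javascript
// Hypothetical base URL; replace it with the one from your PDFBolt account.
const API_BASE = 'https://api.example.com';

// Build the endpoint and fetch options for an asynchronous conversion request.
function buildAsyncRequest({ apiKey, sourceUrl, webhook, customS3PresignedUrl }) {
  if (!webhook) throw new Error('webhook is required for asynchronous conversion');
  const body = { url: sourceUrl, webhook };
  // Optional: have the PDF uploaded directly to your own bucket.
  if (customS3PresignedUrl) body.customS3PresignedUrl = customS3PresignedUrl;
  return {
    endpoint: `${API_BASE}/async`,
    options: {
      method: 'POST',
      // 'API-KEY' is an assumed header name; check your API reference.
      headers: { 'Content-Type': 'application/json', 'API-KEY': apiKey },
      body: JSON.stringify(body),
    },
  };
}

// Send the request and return the requestId from the acknowledgment.
// fetchImpl defaults to the global fetch available in Node.js 18+.
async function submitAsync(params, fetchImpl = fetch) {
  const { endpoint, options } = buildAsyncRequest(params);
  const response = await fetchImpl(endpoint, options);
  if (!response.ok) throw new Error(`Request failed with HTTP ${response.status}`);
  const { requestId } = await response.json();
  return requestId;
}
```

Because submitAsync resolves as soon as the acknowledgment arrives, the caller can store the requestId and move on; the PDF itself arrives later through the webhook.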
Here's a visual representation of the asynchronous flow:
Benefits of Asynchronous Conversion
- Improved Efficiency: By not waiting for each PDF to be generated before sending the next request, you can process large batches of documents more quickly.
- Parallel Processing: Multiple documents are processed simultaneously, maximizing resource utilization.
- Non-Blocking Operations: Your application remains responsive, as it doesn't need to wait for the API to finish processing each document.
- Direct S3 Uploads: With the customS3PresignedUrl parameter, PDFs can be uploaded directly to your S3 bucket, enhancing security and compliance.
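The non-blocking batch pattern behind these benefits might look like the sketch below. The names submitBatch and submit are hypothetical: submit stands for any function that posts one request to /async and resolves to the returned requestId. Matching callbacks by requestId also assumes the webhook payload echoes that value, which should be confirmed against the Webhook Request Parameters.

```javascript
// Submit every document without waiting for any PDF to finish.
// `submit` is a hypothetical function that posts one request to /async
// and resolves to the requestId from the acknowledgment.
async function submitBatch(sourceUrls, submit) {
  const results = await Promise.allSettled(sourceUrls.map((url) => submit(url)));
  const pending = new Map(); // requestId -> source URL, for matching later webhook callbacks
  const failed = [];         // source URLs whose submission was rejected
  results.forEach((result, index) => {
    if (result.status === 'fulfilled') pending.set(result.value, sourceUrls[index]);
    else failed.push(sourceUrls[index]);
  });
  return { pending, failed };
}
```

Promise.allSettled is used rather than Promise.all so that one rejected submission does not discard the requestIds of the others.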
Getting Started with Asynchronous Conversion
To use the asynchronous conversion mode:
- Prepare Your Request: Include the webhook parameter in your request body, along with other required parameters.
- Set Up Your Webhook: Ensure that your webhook endpoint is ready to receive POST requests from PDFBolt. It should be accessible over HTTPS for security.
- Handle the Callback: When your webhook receives the callback, handle the response by checking the status and retrieving the documentUrl.
- Retrieve the PDF: Download the PDF from the documentUrl provided in the webhook payload.
For detailed information on the parameters and their usage, please refer to the Conversion Parameters section.
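The webhook steps above (receive the POST, check status, read documentUrl) can be sketched with Node's built-in http module. This is an illustrative sketch, not an official handler: the 'SUCCESS' status value and the names interpretCallback and createWebhookServer are assumptions, and the real status values are listed in the Webhook Request Parameters.

```javascript
const http = require('http');

// Assumed value of a successful `status`; verify it against the webhook documentation.
const SUCCESS_STATUS = 'SUCCESS';

// Decide what to do with a parsed webhook payload.
function interpretCallback(payload) {
  const { status, documentUrl } = payload || {};
  if (status === SUCCESS_STATUS && documentUrl) return { ok: true, documentUrl };
  return { ok: false, reason: `conversion status: ${status}` };
}

// Minimal webhook receiver. In production, expose it over HTTPS,
// for example behind a TLS-terminating reverse proxy.
function createWebhookServer(onResult) {
  return http.createServer((req, res) => {
    if (req.method !== 'POST') {
      res.writeHead(405).end();
      return;
    }
    let raw = '';
    req.on('data', (chunk) => { raw += chunk; });
    req.on('end', () => {
      let payload;
      try {
        payload = JSON.parse(raw);
      } catch (err) {
        res.writeHead(400).end(); // body was not valid JSON
        return;
      }
      res.writeHead(200).end(); // acknowledge first, then process the result
      onResult(interpretCallback(payload));
    });
  });
}
```

A caller would start it with something like createWebhookServer(handleResult).listen(3000), where handleResult downloads the file from documentUrl when ok is true.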
Uploading to Your S3 Bucket
PDFBolt offers the option to upload generated PDFs directly to your own S3-compatible storage, which improves security and helps with compliance, particularly with GDPR requirements. With this feature, your documents are stored in your own environment, and you have full control over them.
Advantages of Uploading to Your S3 Bucket
- Enhanced Security: By storing PDFs in your own S3 bucket, you reduce the risk associated with transferring files over networks and relying on third-party storage.
- GDPR Compliance: Direct uploads to your storage mean that PDFBolt doesn't store your documents longer than necessary, aligning with GDPR's data minimization principles.
- Control Over Data: You have full control over access permissions, storage duration, and data handling policies for your documents.
How It Works
- Generate a Pre-Signed URL: Create a pre-signed URL that allows PDFBolt to upload the PDF directly to your S3 bucket without needing your storage credentials.
- Include customS3PresignedUrl in Your Request: When making a request to the PDFBolt API, include the customS3PresignedUrl parameter with the generated pre-signed URL.
- PDFBolt Uploads to Your Bucket: Once the PDF is generated, PDFBolt uses the pre-signed URL to upload the document directly to your S3 bucket.
- Access Your PDF: You can then access the PDF from your own S3 bucket, following your usual procedures.
Supported S3-Compatible Storage
PDFBolt supports any S3-compatible storage, not just AWS S3. This means you can use services like:
- Amazon S3
- DigitalOcean Spaces
- MinIO
- Wasabi
- Backblaze B2
Example: Generating a Pre-Signed URL in Node.js
Here's how you can generate a pre-signed URL in Node.js using the AWS SDK for JavaScript v2 (the aws-sdk package):
const AWS = require('aws-sdk');

// Configure AWS with your access and secret key.
// These are placeholders; in production, prefer environment variables or an IAM role.
AWS.config.update({
  accessKeyId: 'YOUR_ACCESS_KEY',
  secretAccessKey: 'YOUR_SECRET_KEY',
  region: 'YOUR_BUCKET_REGION'
});

const s3 = new AWS.S3();

// Describe the object that PDFBolt will be allowed to upload.
const params = {
  Bucket: 'your-bucket-name',
  Key: 'path/to/your-document.pdf',
  Expires: 3600, // URL expires in 1 hour
  ContentType: 'application/pdf',
};

// Sign a PUT request for exactly this bucket, key, and content type.
const presignedUrl = s3.getSignedUrl('putObject', params);
console.log('Pre-Signed URL:', presignedUrl);
- Ensure the pre-signed URL is valid and correctly configured to accept uploads.
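One way to confirm that a pre-signed URL accepts uploads is to try a small PUT against it yourself before handing it to PDFBolt. The sketch below assumes Node.js 18+ (global fetch); checkPresignedUpload is a hypothetical helper name, and the check writes a placeholder object at the signed key, which the later upload to the same key would normally replace.

```javascript
// Try a PUT against the pre-signed URL with the same Content-Type that was signed.
// Note: this writes a small placeholder object at the signed key.
async function checkPresignedUpload(presignedUrl, fetchImpl = fetch) {
  const response = await fetchImpl(presignedUrl, {
    method: 'PUT',
    headers: { 'Content-Type': 'application/pdf' },
    body: Buffer.from('%PDF-1.4\n% placeholder\n'),
  });
  // A 2xx response means the storage service accepted the upload.
  return { accepted: response.ok, httpStatus: response.status };
}
```

A 403 response here usually points to a signature, permission, or Content-Type mismatch rather than a problem on the PDFBolt side.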
Using the Pre-Signed URL with PDFBolt
Include the customS3PresignedUrl in your request to the PDFBolt API:
{
  "url": "https://example.com",
  "customS3PresignedUrl": "https://your-bucket.s3.amazonaws.com/path/to/your-document.pdf?AWSAccessKeyId=YOUR_ACCESS_KEY&Expires=1609459200&Signature=YOUR_SIGNATURE"
}
- Permissions: Ensure that the IAM user or role you use to generate the pre-signed URL has the necessary permissions to perform the putObject operation on the specified bucket and key.
- Expiration: Set an appropriate expiration time for the pre-signed URL, ensuring it remains valid during the PDF generation and upload process.
- Content Type: Specify the ContentType as application/pdf when generating the pre-signed URL.
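The expiration point can be checked in code before a request is submitted. The following sketch reads the expiry from the URL's query string, assuming the URL uses the AWS query parameters: Expires (an absolute Unix timestamp, as in the example request above) or X-Amz-Date plus X-Amz-Expires (Signature Version 4). The helper names presignedUrlExpiry and hasTimeLeft are illustrative, and other S3-compatible providers may use different parameters.

```javascript
// Return the expiry time of a pre-signed URL as a Date, or null if it cannot be determined.
function presignedUrlExpiry(presignedUrl) {
  const query = new URL(presignedUrl).searchParams;

  // Signature Version 2: `Expires` is an absolute Unix timestamp in seconds.
  if (query.has('Expires')) {
    return new Date(Number(query.get('Expires')) * 1000);
  }

  // Signature Version 4: `X-Amz-Date` (YYYYMMDDTHHMMSSZ) plus `X-Amz-Expires` seconds.
  const signedAt = query.get('X-Amz-Date');
  const lifetime = query.get('X-Amz-Expires');
  const match = signedAt && /^(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})Z$/.exec(signedAt);
  if (match && lifetime) {
    const [, y, mo, d, h, mi, s] = match.map(Number);
    return new Date(Date.UTC(y, mo - 1, d, h, mi, s) + Number(lifetime) * 1000);
  }
  return null;
}

// True if the URL stays valid for at least `marginSeconds` counted from `now`.
function hasTimeLeft(presignedUrl, marginSeconds, now = new Date()) {
  const expiry = presignedUrlExpiry(presignedUrl);
  return expiry !== null && expiry.getTime() - now.getTime() >= marginSeconds * 1000;
}
```

For example, calling hasTimeLeft(presignedUrl, 600) before submitting a request rejects URLs that would expire within ten minutes; the right margin depends on how long your conversions typically take.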
For more details on how to generate pre-signed URLs in other languages, refer to the AWS SDK documentation or your storage provider's SDK.