返回技能库

PDF.co

PDF.co API 与托管 OAuth 的集成。转换、合并、拆分、编辑 PDF 并提取数据。当用户想将 PDF 转换为/从其他 fo... 时使用此技能。

作者:byungkyu · 最新版本:1.0.1

收藏:1 · 下载:1.1k

说明文档

# PDF.co

Access the PDF.co API with managed authentication. Convert, merge, split, and edit PDFs with full document manipulation capabilities.

## Quick Start

```bash
# Get PDF info
python <<'EOF'
import urllib.request, os, json
data = json.dumps({'url': 'https://example.com/sample.pdf'}).encode()
req = urllib.request.Request('https://gateway.maton.ai/pdf-co/v1/pdf/info', data=data, method='POST')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
req.add_header('Content-Type', 'application/json')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

## Base URL

```
https://gateway.maton.ai/pdf-co/{native-api-path}
```

Replace `{native-api-path}` with the actual PDF.co API endpoint path. The gateway proxies requests to `api.pdf.co` and automatically injects your API credentials.

## Authentication

All requests require the Maton API key in the Authorization header:

```
Authorization: Bearer $MATON_API_KEY
```

**Environment Variable:** Set your API key as `MATON_API_KEY`:

```bash
export MATON_API_KEY="YOUR_API_KEY"
```

### Getting Your API Key

1. Sign in or create an account at [maton.ai](https://maton.ai)
2. Go to [maton.ai/settings](https://maton.ai/settings)
3. Copy your API key

## Connection Management

Manage your PDF.co connections at `https://ctrl.maton.ai`.

### List Connections

```bash
python <<'EOF'
import urllib.request, os, json
req = urllib.request.Request('https://ctrl.maton.ai/connections?app=pdf-co&status=ACTIVE')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

### Create Connection

```bash
python <<'EOF'
import urllib.request, os, json
data = json.dumps({'app': 'pdf-co'}).encode()
req = urllib.request.Request('https://ctrl.maton.ai/connections', data=data, method='POST')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
req.add_header('Content-Type', 'application/json')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

### Get Connection

```bash
python <<'EOF'
import urllib.request, os, json
req = urllib.request.Request('https://ctrl.maton.ai/connections/{connection_id}')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

**Response:**
```json
{
  "connection": {
    "connection_id": "21fd90f9-5935-43cd-b6c8-bde9d915ca80",
    "status": "ACTIVE",
    "creation_time": "2025-12-08T07:20:53.488460Z",
    "last_updated_time": "2026-01-31T20:03:32.593153Z",
    "url": "https://connect.maton.ai/?session_token=...",
    "app": "pdf-co",
    "metadata": {}
  }
}
```

Open the returned `url` in a browser to complete authorization.

### Delete Connection

```bash
python <<'EOF'
import urllib.request, os, json
req = urllib.request.Request('https://ctrl.maton.ai/connections/{connection_id}', method='DELETE')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

### Specifying Connection

If you have multiple PDF.co connections, specify which one to use with the `Maton-Connection` header:

```bash
python <<'EOF'
import urllib.request, os, json
data = json.dumps({'url': 'https://example.com/sample.pdf'}).encode()
req = urllib.request.Request('https://gateway.maton.ai/pdf-co/v1/pdf/info', data=data, method='POST')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
req.add_header('Content-Type', 'application/json')
req.add_header('Maton-Connection', '21fd90f9-5935-43cd-b6c8-bde9d915ca80')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

If omitted, the gateway uses the default (oldest) active connection.

## API Reference

### PDF Information

Get metadata and information about a PDF file.

```bash
POST /pdf-co/v1/pdf/info
Content-Type: application/json

{
  "url": "https://example.com/document.pdf"
}
```

### Convert PDF to Text

```bash
POST /pdf-co/v1/pdf/convert/to/text
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "pages": "0-",
  "inline": true
}
```

**Response:**
```json
{
  "body": "Extracted text content...",
  "pageCount": 5,
  "error": false,
  "status": 200,
  "name": "document.txt",
  "credits": 10,
  "remainingCredits": 9990
}
```

### Convert PDF to CSV

```bash
POST /pdf-co/v1/pdf/convert/to/csv
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "pages": "0-",
  "inline": true,
  "lang": "eng"
}
```

### Convert PDF to JSON

```bash
POST /pdf-co/v1/pdf/convert/to/json
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "pages": "0-",
  "inline": true
}
```

### Convert PDF to HTML

```bash
POST /pdf-co/v1/pdf/convert/to/html
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "pages": "0-",
  "name": "output.html"
}
```

### Convert PDF to XLSX (Excel)

```bash
POST /pdf-co/v1/pdf/convert/to/xlsx
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "pages": "0-",
  "name": "output.xlsx"
}
```

### Convert PDF to PNG

```bash
POST /pdf-co/v1/pdf/convert/to/png
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "pages": "0",
  "name": "page.png"
}
```

### Convert PDF to JPG

```bash
POST /pdf-co/v1/pdf/convert/to/jpg
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "pages": "0",
  "name": "page.jpg"
}
```

### Convert HTML to PDF

```bash
POST /pdf-co/v1/pdf/convert/from/html
Content-Type: application/json

{
  "html": "<html><body><h1>Hello World</h1></body></html>",
  "name": "output.pdf",
  "paperSize": "Letter",
  "orientation": "Portrait",
  "margins": "10 10 10 10"
}
```

**Response:**
```json
{
  "url": "https://pdf-temp-files.s3.amazonaws.com/...",
  "pageCount": 1,
  "error": false,
  "status": 200,
  "name": "output.pdf",
  "remainingCredits": 9980
}
```

### Convert URL to PDF

```bash
POST /pdf-co/v1/pdf/convert/from/url
Content-Type: application/json

{
  "url": "https://example.com",
  "name": "webpage.pdf",
  "paperSize": "A4",
  "orientation": "Portrait"
}
```

### Merge PDFs

Combine multiple PDFs into a single document.

```bash
POST /pdf-co/v1/pdf/merge
Content-Type: application/json

{
  "url": "https://example.com/doc1.pdf,https://example.com/doc2.pdf",
  "name": "merged.pdf"
}
```

**Response:**
```json
{
  "url": "https://pdf-temp-files.s3.amazonaws.com/merged.pdf",
  "pageCount": 10,
  "error": false,
  "status": 200,
  "name": "merged.pdf",
  "remainingCredits": 9970,
  "duration": 1500
}
```

### Split PDF

Split a PDF into multiple files.

```bash
POST /pdf-co/v1/pdf/split
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "pages": "1-3,4-6,7-"
}
```

**Response:**
```json
{
  "urls": [
    "https://pdf-temp-files.s3.amazonaws.com/part1.pdf",
    "https://pdf-temp-files.s3.amazonaws.com/part2.pdf",
    "https://pdf-temp-files.s3.amazonaws.com/part3.pdf"
  ],
  "pageCount": 10,
  "error": false,
  "status": 200,
  "remainingCredits": 9960
}
```

### Delete Pages

```bash
POST /pdf-co/v1/pdf/edit/delete-pages
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "pages": "2,4,6"
}
```


### Add Text and Images

Add text, images, or other content to a PDF.

```bash
POST /pdf-co/v1/pdf/edit/add
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "name": "annotated.pdf",
  "annotations": [
    {
      "text": "CONFIDENTIAL",
      "x": 100,
      "y": 100,
      "size": 24,
      "pages": "0-"
    }
  ]
}
```

### Search and Replace Text

```bash
POST /pdf-co/v1/pdf/edit/replace-text
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "searchString": "old text",
  "replaceString": "new text"
}
```

### Search and Delete Text

```bash
POST /pdf-co/v1/pdf/edit/delete-text
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "searchString": "text to remove"
}
```

### Add Password

```bash
POST /pdf-co/v1/pdf/security/add
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "ownerPassword": "owner123",
  "userPassword": "user456"
}
```

### Remove Password

```bash
POST /pdf-co/v1/pdf/security/remove
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "password": "currentpassword"
}
```

### AI Invoice Parser

Automatically extract structured data from invoices.

```bash
POST /pdf-co/v1/ai-invoice-parser
Content-Type: application/json

{
  "url": "https://example.com/invoice.pdf"
}
```

### Document Parser

Extract data using templates.

```bash
POST /pdf-co/v1/pdf/documentparser
Content-Type: application/json

{
  "url": "https://example.com/document.pdf",
  "templateId": "your-template-id"
}
```


### Generate Barcode

```bash
POST /pdf-co/v1/barcode/generate
Content-Type: application/json

{
  "value": "1234567890",
  "type": "QRCode",
  "name": "barcode.png"
}
```

### Read Barcode

```bash
POST /pdf-co/v1/barcode/read/from/url
Content-Type: application/json

{
  "url": "https://example.com/barcode.png",
  "types": "QRCode,Code128,Code39,EAN13,UPCA"
}
```

### Check Job Status (Async)

For async operations, check job status.

```bash
POST /pdf-co/v1/job/check
Content-Type: application/json

{
  "jobId": "abc123"
}
```

## Async Processing

For large files or batch operations, use async processing:

```bash
POST /pdf-co/v1/pdf/merge
Content-Type: application/json

{
  "url": "https://example.com/large1.pdf,https://example.com/large2.pdf",
  "async": true,
  "name": "merged.pdf"
}
```

**Response:**
```json
{
  "jobId": "abc123",
  "status": "working",
  "error": false
}
```

Then poll the job status:

```bash
POST /pdf-co/v1/job/check
Content-Type: application/json

{
  "jobId": "abc123"
}
```

## Code Examples

### JavaScript

```javascript
const response = await fetch(
  'https://gateway.maton.ai/pdf-co/v1/pdf/merge',
  {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.MATON_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      url: 'https://example.com/doc1.pdf,https://example.com/doc2.pdf',
      name: 'merged.pdf'
    })
  }
);
const result = await response.json();
console.log(result.url);
```

### Python

```python
import os
import requests

response = requests.post(
    'https://gateway.maton.ai/pdf-co/v1/pdf/merge',
    headers={'Authorization': f'Bearer {os.environ["MATON_API_KEY"]}'},
    json={
        'url': 'https://example.com/doc1.pdf,https://example.com/doc2.pdf',
        'name': 'merged.pdf'
    }
)
result = response.json()
print(result['url'])
```

## Notes

- All file URLs must be publicly accessible or use PDF.co temporary storage
- Multiple URLs for merge operations should be comma-separated
- Page indices are 0-based (first page is `0`)
- Page ranges use format: `0-2` (pages 0,1,2), `3-` (page 3 to end), `0,2,4` (specific pages)
- Output files are stored temporarily and expire after 60 minutes by default
- Use `async: true` for large files to avoid timeout
- Use `inline: true` to get content directly in response instead of URL
- IMPORTANT: When using curl commands, use `curl -g` when URLs contain brackets to disable glob parsing
- IMPORTANT: When piping curl output to `jq` or other commands, environment variables like `$MATON_API_KEY` may not expand correctly in some shell environments

## Error Handling

| Status | Meaning |
|--------|---------|
| 400 | Missing PDF.co connection or invalid request |
| 401 | Invalid or missing Maton API key |
| 429 | Rate limited |
| 4xx/5xx | Passthrough error from PDF.co API |

### Troubleshooting: API Key Issues

1. Check that the `MATON_API_KEY` environment variable is set:

```bash
echo $MATON_API_KEY
```

2. Verify the API key is valid by listing conne...