# tabula
> Extract tables from PDF files.
> More information: <https://github.com/tabulapdf/tabula-java#commandline-usage-examples>.
- Extract all tables from a PDF to a CSV file:
`tabula {{[-o|--outfile]}} {{file.csv}} {{file.pdf}}`
- Extract all tables from a PDF to a JSON file:
`tabula {{[-f|--format]}} JSON {{[-o|--outfile]}} {{file.json}} {{file.pdf}}`
- Extract tables from pages 1, 2, 3, and 6 of a PDF:
`tabula {{[-p|--pages]}} 1-3,6 {{file.pdf}}`
- Extract tables from page 1 of a PDF, guessing which portion of the page to examine:
`tabula {{[-g|--guess]}} {{[-p|--pages]}} 1 {{file.pdf}}`
- Extract all tables from a PDF, using ruling lines to determine cell boundaries:
`tabula {{[-r|--spreadsheet]}} {{file.pdf}}`
- Extract all tables from a PDF, using blank space to determine cell boundaries:
`tabula {{[-n|--no-spreadsheet]}} {{file.pdf}}`