Data Loading Overview
Cloudberry Database loads external data mainly by using loading tools to expose that data as external tables (or foreign tables). It then reads from these external tables, or writes into them, to complete the load.
Loading process
The general process of loading external data into Cloudberry Database is as follows:
- Assess the data loading scenario (such as data source location, data type, and data volume) and select an appropriate loading tool.
- Set up and enable the loading tool.
- Create an external table, specifying information such as the protocol of the loading tool, the data source address, and the data format in the `CREATE EXTERNAL TABLE` statement.
- Once the external table is created, query its data directly with a `SELECT` statement, or import the data into a regular table with `INSERT INTO ... SELECT`.
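The last two steps can be sketched in SQL as follows. This is a minimal sketch, assuming a gpfdist instance is already serving files from a hypothetical host `etl-host` on port 8081. The table names, column list, and file pattern are illustrative assumptions, not values taken from this page.

```sql
-- Hypothetical target table that will hold the loaded rows.
CREATE TABLE sales (
    sale_id   int,
    sale_date date,
    amount    numeric(10,2)
) DISTRIBUTED BY (sale_id);

-- External table: names the protocol (gpfdist), the data source address,
-- and the data format. Host, port, and file pattern are assumptions.
CREATE EXTERNAL TABLE ext_sales (
    sale_id   int,
    sale_date date,
    amount    numeric(10,2)
)
LOCATION ('gpfdist://etl-host:8081/sales/*.csv')
FORMAT 'CSV' (HEADER);

-- Query the external data directly ...
SELECT count(*) FROM ext_sales;

-- ... or import it into the regular table.
INSERT INTO sales SELECT * FROM ext_sales;
```

Keeping the external table separate from the target table lets you inspect or filter the incoming rows with ordinary SQL before committing them to the database.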
Loading methods and scenarios
Cloudberry Database offers several data loading methods. Choose one according to your data source, as summarized in the following table.
Loading method | Data source | Data format | Parallel or not |
---|---|---|---|
copy | Local file system • Coordinator node host (for a single file) • Segment node host (for multiple files) | • TXT • CSV • Binary | No |
file:// protocol | Local file system (local segment host, accessible only by superuser) | • TXT • CSV | Yes |
gpfdist | Local host files or files accessible via internal network | • TXT • CSV • Any delimited text format supported by the FORMAT clause • XML and JSON (require conversion to text format via YAML configuration file) | Yes |
Batch loading using gpload (with gpfdist as the underlying worker) | Local host files or files accessible via internal network | • TXT • CSV • Any delimited text format supported by the FORMAT clause • XML and JSON (require conversion to text format via YAML configuration file) | Yes |
Creating external web tables | Data pulled from network services or from any source accessible by command lines | • TXT • CSV | Yes |
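To make the difference between the first two rows concrete, here is a hedged sketch that contrasts a non-parallel `COPY` with a parallel `file://` external table. It assumes a hypothetical existing table `sales (sale_id int, sale_date date, amount numeric(10,2))`; the host names `seg-host1` and `seg-host2` and all file paths are likewise assumptions.

```sql
-- COPY: non-parallel load of a single file on the coordinator host.
-- Assumes the hypothetical table "sales" already exists.
COPY sales FROM '/data/sales/2024.csv' WITH (FORMAT csv, HEADER);

-- file:// protocol: each URI names a segment host and a file on that host,
-- so the segments read their local files in parallel.
-- Must be created by a superuser; hosts and paths are assumptions.
CREATE EXTERNAL TABLE ext_sales_file (
    sale_id   int,
    sale_date date,
    amount    numeric(10,2)
)
LOCATION (
    'file://seg-host1/data/sales/part1.csv',
    'file://seg-host2/data/sales/part2.csv'
)
FORMAT 'CSV';

-- Import the rows read through the external table.
INSERT INTO sales SELECT * FROM ext_sales_file;
```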
Learn more
- Load Data from Local Files
- Load Data from Web Services: To load data from web services, or from any source that a command can access, create external web tables. The supported data formats are TEXT and CSV.
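As an illustration of external web tables, the sketch below shows a command-based and a URL-based definition. The script path, URL, table names, and columns are hypothetical; only the general `CREATE EXTERNAL WEB TABLE` form is assumed to carry over from the Greenplum-style syntax that Cloudberry Database follows.

```sql
-- Command-based web table: the script runs on each segment host and its
-- standard output is read as rows. The script path is an assumption.
CREATE EXTERNAL WEB TABLE ext_log_output (
    line_num int,
    message  text
)
EXECUTE '/var/load_scripts/get_log_data.sh' ON HOST
FORMAT 'TEXT' (DELIMITER '|');

-- URL-based web table: rows are fetched over HTTP from a web service.
-- The URL is an assumption.
CREATE EXTERNAL WEB TABLE ext_web_sales (
    sale_id   int,
    sale_date date,
    amount    numeric(10,2)
)
LOCATION ('http://intranet.example.com/sales/2024.csv')
FORMAT 'CSV';

-- Either table can then be queried like any other external table.
SELECT * FROM ext_log_output LIMIT 10;
```

Because the data behind a web table can change between queries, it is usually safer to copy the rows into a regular table with `INSERT INTO ... SELECT` before running repeated analysis on them.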