civis.io.civis_to_multifile_csv

Unload the result of SQL query and return presigned urls.

This function is intended for unloading large queries/tables from redshift as it uses a ‘PARALLEL ON’ S3 unload. It returns a similar manifest file to conventional S3 UNLOAD statements except the CSV parts are accessible via both files endpoint IDs and presigned S3 urls.

Parameters:

sqlstr: The SQL select string to be executed.
databasestr or int: Execute the query against this database. Can be the database name or ID.
sql_params_argumentsdict, optional: A dictionary of SQL query parameters to pass directly to civis.APIClient.scripts.post_sql. The only allowed keys are "params" (whose value is a list[dict]) and "arguments" (whose value is a dict). Please refer to the linked API documentation for how to format these two keys’ values.
job_namestr, optional: A name to give the job. If omitted, a random job name will be used.
clientcivis.APIClient, optional: If not provided, an civis.APIClient object will be created from the CIVIS_API_KEY.
credential_idstr or int, optional: The database credential ID. If None, the default credential will be used.
include_header: bool, optional: If True include a key in the returned dictionary containing a list of column names. Default: True.
compression: str, optional: Type of compression to use, if any. One of 'none', 'zip', or 'gzip'. Default 'none'.
delimiter: str, optional: Which delimiter to use, if any. One of ',', ' ', or '|'. Default: '|'.
max_file_size: int, optional: Maximum number of Megabytes each created file will be.
unquoted: bool, optional: Whether or not to quote fields. Default: False.
prefix: str, optional: A user specified filename prefix for the output file to have. Default: None.
polling_intervalint or float, optional: Number of seconds to wait between checks for query completion.
hiddenbool, optional: If True (the default), this job will not appear in the Civis UI.

Returns:

unload_manifest: dict

A dictionary resembling an AWS manifest file. Has the following keys:

‘query’: str

The query.

‘header’: list of str

The columns from the query.

‘entries’: list of dict

Each dict has the following keys:

‘id’: int: File ID
‘name’: str: Filename
‘size’: int: File size in bytes
‘url’: str: Unsigned S3 URL (‘s3://…’)
‘url_signed’: str: Signed S3 URL (’https://…’)

‘unquoted’: bool

Whether the cells are quoted.

‘compression’: str

Type of compression used.

‘delimiter’: str

Delimiter that separates the cells.