civis.io.dataframe_to_civis
civis.io.dataframe_to_civis(df, database, table, api_key=None, client=None, max_errors=None, existing_table_rows='fail', diststyle=None, distkey=None, sortkey1=None, sortkey2=None, headers=None, credential_id=None, primary_keys=None, last_modified_keys=None, execution='immediate', delimiter=None, polling_interval=None, archive=False, hidden=True, **kwargs)
Upload a pandas DataFrame into a Civis table.
The DataFrame’s index will not be included. To store the index along with the other values, use df.reset_index() instead of df as the first argument to this function.
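As a minimal sketch of the note above, using only pandas (the data and the index name `id` are illustrative, not taken from the Civis documentation), `reset_index()` moves the index into an ordinary column so that it would be uploaded along with the other values:

```python
import pandas as pd

# Illustrative data: the index carries labels that a plain upload would drop.
df = pd.DataFrame({"a": [1, 2, 3]}, index=pd.Index(["x", "y", "z"], name="id"))

# reset_index() returns a new DataFrame in which the old index is a regular column.
df_with_index = df.reset_index()

columns_before = list(df.columns)
columns_after = list(df_with_index.columns)
print(columns_before, columns_after)
```

Passing `df_with_index` rather than `df` as the first argument would then include the `id` values in the table. If the index has no name, pandas names the new column `index`.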
Parameters:
- df : pandas.DataFrame
The DataFrame to upload to Civis.
- database : str or int
Upload data into this database. Can be the database name or ID.
- table : str
The schema and table you want to upload to, e.g., 'scratch.table'. Schemas or table names with periods must be double quoted, e.g., 'scratch."my.table"'.
- api_key : DEPRECATED str, optional
Your Civis API key. If not given, the CIVIS_API_KEY environment variable will be used.
- client : civis.APIClient, optional
If not provided, a civis.APIClient object will be created from the CIVIS_API_KEY.
- max_errors : int, optional
The maximum number of rows with errors to remove from the import before failing.
- existing_table_rows : str, optional
The behaviour if a table with the requested name already exists. One of 'fail', 'truncate', 'append', 'drop', or 'upsert'. Defaults to 'fail'.
- diststyle : str, optional
The distribution style for the table. One of 'even', 'all' or 'key'.
- distkey : str, optional
The column to use as the distkey for the table.
- sortkey1 : str, optional
The column to use as the sortkey for the table.
- sortkey2 : str, optional
The second column in a compound sortkey for the table.
- headers : bool, optional [DEPRECATED]
Whether or not the first row of the file should be treated as headers. The default, None, attempts to autodetect whether or not the first row contains headers.
This parameter has no effect in versions >= 1.11 and will be removed in v2.0. Tables will always be written with column names read from the DataFrame. Use the header parameter (which will be passed directly to to_csv()) to modify the column names in the Civis Table.
- credential_id : str or int, optional
The ID of the database credential. If None, the default credential will be used.
- primary_keys: list[str], optional
A list of the primary key column(s) of the destination table that uniquely identify a record. If existing_table_rows is “upsert”, this field is required. Note that this is true regardless of whether the destination database itself requires a primary key.
- last_modified_keys: list[str], optional
A list of the columns indicating a record has been updated. If existing_table_rows is “upsert”, this field is required.
- escaped: bool, optional
A boolean value indicating whether or not the source file has quotes escaped with a backslash. Defaults to false.
- execution: string, optional, default “immediate”
One of “delayed” or “immediate”. If “immediate”, refresh column statistics as part of the run. If “delayed”, flag the table for a deferred statistics update; column statistics may not be available for up to 24 hours. In addition, if existing_table_rows is “upsert”, delayed executions move data from staging table to final table after a brief delay, in order to accommodate multiple concurrent imports to the same destination table.
- polling_interval : int or float, optional
Number of seconds to wait between checks for job completion.
- archive : bool, optional (deprecated)
If True, archive the import job as soon as it completes.
- hidden : bool, optional
If True (the default), this job will not appear in the Civis UI.
- **kwargs : kwargs
Extra keyword arguments will be passed to pandas.DataFrame.to_csv().
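The constraints described in the parameter list (the allowed values of existing_table_rows and execution, and the two key lists required for an upsert) can be checked before submitting an import. The sketch below is one possible way to do that; `build_import_options` is a hypothetical helper written for illustration and is not part of the civis package.

```python
from typing import Any, Dict, List, Optional

# Allowed values as listed in the parameter descriptions above.
VALID_EXISTING_TABLE_ROWS = {"fail", "truncate", "append", "drop", "upsert"}
VALID_EXECUTION = {"immediate", "delayed"}


def build_import_options(
    existing_table_rows: str = "fail",
    primary_keys: Optional[List[str]] = None,
    last_modified_keys: Optional[List[str]] = None,
    execution: str = "immediate",
) -> Dict[str, Any]:
    """Hypothetical helper: validate documented constraints and return kwargs."""
    if existing_table_rows not in VALID_EXISTING_TABLE_ROWS:
        raise ValueError(f"unknown existing_table_rows: {existing_table_rows!r}")
    if execution not in VALID_EXECUTION:
        raise ValueError(f"unknown execution: {execution!r}")
    # The documentation states both key lists are required for an upsert.
    if existing_table_rows == "upsert" and not (primary_keys and last_modified_keys):
        raise ValueError("upsert requires primary_keys and last_modified_keys")

    options: Dict[str, Any] = {
        "existing_table_rows": existing_table_rows,
        "execution": execution,
    }
    if primary_keys:
        options["primary_keys"] = list(primary_keys)
    if last_modified_keys:
        options["last_modified_keys"] = list(last_modified_keys)
    return options


upsert_options = build_import_options(
    existing_table_rows="upsert",
    primary_keys=["id"],            # illustrative column name
    last_modified_keys=["updated_at"],  # illustrative column name
    execution="delayed",
)
print(upsert_options)
```

The resulting dictionary could then be unpacked into the call, for example `civis.io.dataframe_to_civis(df, database, table, **upsert_options)`. Checking locally only surfaces obvious mistakes earlier; the Civis API remains the authority on what it accepts.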
Returns:
- fut : CivisFuture
A CivisFuture object.
See also
to_csv()
Examples
>>> import civis
>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
>>> fut = civis.io.dataframe_to_civis(df, 'my-database',
...                                   'scratch.df_table')
>>> fut.result()
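The table parameter requires schemas or table names that contain periods to be double quoted. A small sketch of how such a name might be assembled follows; `qualified_table_name` is a hypothetical helper, not a civis function, and it assumes the names do not already contain double quotes.

```python
def qualified_table_name(schema: str, table: str) -> str:
    """Hypothetical helper: build 'schema.table', quoting parts that contain periods."""

    def quote_if_needed(name: str) -> str:
        # Only names with a period need double quotes, per the table parameter docs.
        return f'"{name}"' if "." in name else name

    return f"{quote_if_needed(schema)}.{quote_if_needed(table)}"


plain_name = qualified_table_name("scratch", "df_table")
dotted_name = qualified_table_name("scratch", "my.table")
print(plain_name)
print(dotted_name)
```

The result can be passed as the table argument, e.g. `civis.io.dataframe_to_civis(df, 'my-database', dotted_name)`.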