civis.io.dataframe_to_civis
- civis.io.dataframe_to_civis(df, database, table, client=None, max_errors=None, existing_table_rows='fail', diststyle=None, distkey=None, sortkey1=None, sortkey2=None, table_columns=None, credential_id=None, primary_keys=None, last_modified_keys=None, execution='immediate', polling_interval=None, hidden=True, **kwargs)
Upload a pandas dataframe into a Civis table.
The dataframe’s index will not be included. To store the index along with the other values, use df.reset_index() instead of df as the first argument to this function.
- Parameters:
- df: pandas.DataFrame
The DataFrame to upload to Civis.
- database: str or int
Upload data into this database. Can be the database name or ID.
- table: str
The schema and table you want to upload to. E.g., 'scratch.table'. Schemas or tablenames with periods must be double quoted, e.g. 'scratch."my.table"'.
- client: civis.APIClient, optional
If not provided, a civis.APIClient object will be created from the CIVIS_API_KEY environment variable.
- max_errors: int, optional
The maximum number of rows with errors to remove from the import before failing.
- existing_table_rows: str, optional
The behaviour if a table with the requested name already exists. One of 'fail', 'truncate', 'append', 'drop', or 'upsert'. Defaults to 'fail'.
- diststyle: str, optional
The distribution style for the table. One of 'even', 'all' or 'key'.
- distkey: str, optional
The column to use as the distkey for the table.
- sortkey1: str, optional
The column to use as the sortkey for the table.
- sortkey2: str, optional
The second column in a compound sortkey for the table.
- table_columns: list[Dict[str, str]], optional
A list of dictionaries, ordered so each dictionary corresponds to a column in the order that it appears in the source file. Each dict should have a key “name” that corresponds to the column name in the destination table, and a key “sql_type” corresponding to the intended column data type in the destination table. The “sql_type” key is not required when appending to an existing table. The table_columns parameter is required if the table does not exist, the table is being dropped, or the columns in the source file do not appear in the same order as in the destination table. Example:
[{"name": "foo", "sql_type": "INT"}, {"name": "bar", "sql_type": "VARCHAR"}]
- credential_id: str or int, optional
The ID of the database credential. If None, the default credential will be used.
- primary_keys: list[str], optional
A list of the primary key column(s) of the destination table that uniquely identify a record. These columns must not contain null values. If existing_table_rows is “upsert”, this field is required. Note that this is true regardless of whether the destination database itself requires a primary key.
- last_modified_keys: list[str], optional
A list of the columns indicating a record has been updated. If existing_table_rows is “upsert”, this field is required.
- execution: string, optional, default “immediate”
One of “delayed” or “immediate”. If “immediate”, refresh column statistics as part of the run. If “delayed”, flag the table for a deferred statistics update; column statistics may not be available for up to 24 hours. In addition, if existing_table_rows is “upsert”, delayed executions move data from staging table to final table after a brief delay, in order to accommodate multiple concurrent imports to the same destination table.
- polling_interval: int or float, optional
Number of seconds to wait between checks for job completion.
- hidden: bool, optional
If True (the default), this job will not appear in the Civis UI.
- **kwargs: kwargs
Extra keyword arguments will be passed to pandas.DataFrame.to_csv().
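The quoting rule for the table parameter above (schemas or tablenames containing periods must be double quoted) can be sketched with a small helper. The names quote_if_needed and qualified_table are hypothetical and not part of the civis library; how embedded double quotes should be handled is not stated in this reference, so the sketch rejects them rather than guessing.

```python
def quote_if_needed(identifier: str) -> str:
    """Wrap an identifier in double quotes if it contains a period.

    Hypothetical helper, not part of civis.
    """
    if '"' in identifier:
        # Assumption: escaping embedded quotes is not documented here,
        # so this sketch refuses such names instead of guessing a rule.
        raise ValueError(f"identifier contains a double quote: {identifier!r}")
    return f'"{identifier}"' if "." in identifier else identifier


def qualified_table(schema: str, table: str) -> str:
    """Join schema and table into the 'schema.table' form used by `table`."""
    return f"{quote_if_needed(schema)}.{quote_if_needed(table)}"


plain = qualified_table("scratch", "df_table")
dotted = qualified_table("scratch", "my.table")
print(plain)
print(dotted)
```

The resulting string would then be passed as the table argument of civis.io.dataframe_to_civis.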
- Returns:
- fut: CivisFuture
A CivisFuture object.
See also
pandas.DataFrame.to_csv()
Examples
>>> import civis
>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
>>> fut = civis.io.dataframe_to_civis(df, 'my-database',
...                                   'scratch.df_table')
>>> fut.result()
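For an upsert, the parameter descriptions say that primary_keys and last_modified_keys are both required, and table_columns takes a list of dicts with "name" and "sql_type" keys. The following is a minimal sketch of assembling those keyword arguments before the call. The helpers build_table_columns and upsert_options, and the column names id, updated_at, and value, are hypothetical and only illustrate the shape of the arguments; the sketch does not contact Civis.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple


def build_table_columns(columns: Sequence[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Turn ordered (name, sql_type) pairs into the table_columns structure."""
    return [{"name": name, "sql_type": sql_type} for name, sql_type in columns]


def upsert_options(
    primary_keys: Sequence[str],
    last_modified_keys: Sequence[str],
    table_columns: Optional[List[Dict[str, str]]] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for an upsert, checking the required keys.

    Hypothetical helper, not part of civis.
    """
    if not primary_keys:
        raise ValueError("primary_keys is required when upserting")
    if not last_modified_keys:
        raise ValueError("last_modified_keys is required when upserting")
    options: Dict[str, Any] = {
        "existing_table_rows": "upsert",
        "primary_keys": list(primary_keys),
        "last_modified_keys": list(last_modified_keys),
    }
    if table_columns is not None:
        options["table_columns"] = table_columns
    return options


options = upsert_options(
    primary_keys=["id"],
    last_modified_keys=["updated_at"],
    table_columns=build_table_columns(
        [("id", "INT"), ("updated_at", "TIMESTAMP"), ("value", "VARCHAR")]
    ),
)

# With a CIVIS_API_KEY configured, the options could then be passed along:
# fut = civis.io.dataframe_to_civis(df, 'my-database', 'scratch.df_table',
#                                   **options)
# fut.result()
```

Keeping the validation separate from the upload call means a missing key is caught locally before any job is started.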