kennethreitz / records

SQL for Humans™

Home Page: https://pypi.python.org/pypi/records/

License: ISC License

Python 99.17% Makefile 0.83%
sql postgres sqlalchemy python orm schemas kennethreitz forhumans

records's Introduction

Records: SQL for Humans™


Records is a very simple, but powerful, library for making raw SQL queries to most relational databases.


Just write SQL. No bells, no whistles. This common task can be surprisingly difficult with the standard tools available. This library strives to make this workflow as simple as possible, while providing an elegant interface to work with your query results.

Database support includes RedShift, Postgres, MySQL, SQLite, Oracle, and MS-SQL (drivers not included).

☤ The Basics

We know how to write SQL, so let's send some to our database:

import records

db = records.Database('postgres://...')
rows = db.query('select * from active_users')    # or db.query_file('sqls/active-users.sql')

Grab one row at a time:

>>> rows[0]
<Record {"username": "model-t", "active": true, "name": "Henry Ford", "user_email": "[email protected]", "timezone": "2016-02-06 22:28:23.894202"}>

Or iterate over them:

for r in rows:
    print(r.name, r.user_email)

Values can be accessed many ways: row.user_email, row['user_email'], or row[3].

Fields with non-alphanumeric characters (like spaces) are also fully supported.

Or store a copy of your record collection for later reference:

>>> rows.all()
[<Record {"username": ...}>, <Record {"username": ...}>, <Record {"username": ...}>, ...]

If you're only expecting one result:

>>> rows.first()
<Record {"username": ...}>

Other options include rows.as_dict() and rows.as_dict(ordered=True).

☤ Features

  • Iterated rows are cached for future reference.
  • $DATABASE_URL environment variable support.
  • Convenience Database.get_table_names method.
  • Command-line records tool for exporting queries.
  • Safe parameterization: Database.query('life=:everything', everything=42).
  • Queries can be passed as strings or filenames, parameters supported.
  • Transactions: t = Database.transaction(); t.commit().
  • Bulk actions: Database.bulk_query() & Database.bulk_query_file().
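The `:name` placeholders shown above are standard named parameters, bound by the driver rather than interpolated into the SQL string. A minimal illustration of the same style using the stdlib `sqlite3` module (a stand-in here, not records itself):

```python
import sqlite3

# In-memory database as a stand-in for records.Database.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (name TEXT, life INTEGER)')

# Named parameters are bound by the driver, never string-formatted,
# so values cannot alter the SQL itself.
conn.execute('INSERT INTO users VALUES (:name, :life)',
             {'name': 'Ford', 'life': 42})

row = conn.execute('SELECT name FROM users WHERE life = :everything',
                   {'everything': 42}).fetchone()
print(row[0])  # Ford
```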

Records is proudly powered by SQLAlchemy and Tablib.

☤ Data Export Functionality

Records also features full Tablib integration, and allows you to export your results to CSV, XLS, JSON, HTML Tables, YAML, or Pandas DataFrames with a single line of code. Excellent for sharing data with friends, or generating reports.

>>> print(rows.dataset)
username|active|name      |user_email       |timezone
--------|------|----------|-----------------|--------------------------
model-t |True  |Henry Ford|[email protected]|2016-02-06 22:28:23.894202
...

Comma Separated Values (CSV)

>>> print(rows.export('csv'))
username,active,name,user_email,timezone
model-t,True,Henry Ford,[email protected],2016-02-06 22:28:23.894202
...

YAML Ain't Markup Language (YAML)

>>> print(rows.export('yaml'))
- {active: true, name: Henry Ford, timezone: '2016-02-06 22:28:23.894202', user_email: model-t@gmail.com, username: model-t}
...

JavaScript Object Notation (JSON)

>>> print(rows.export('json'))
[{"username": "model-t", "active": true, "name": "Henry Ford", "user_email": "[email protected]", "timezone": "2016-02-06 22:28:23.894202"}, ...]

Microsoft Excel (xls, xlsx)

with open('report.xls', 'wb') as f:
    f.write(rows.export('xls'))

Pandas DataFrame

>>> rows.export('df')
    username  active       name        user_email                   timezone
0    model-t    True Henry Ford model-t@gmail.com 2016-02-06 22:28:23.894202

You get the point. All other features of Tablib are also available, so you can sort results, add/remove columns/rows, remove duplicates, transpose the table, add separators, slice data by column, and more.

See the Tablib Documentation for more details.

☤ Installation

Of course, the recommended installation method is pipenv:

$ pipenv install records[pandas]
✨🍰✨

☤ Thank You

Thanks for checking this library out! I hope you find it useful.

Of course, there's always room for improvement. Feel free to open an issue so we can make Records better, stronger, faster.



records's People

Contributors

brobin, btubbs, chadwhitacre, dschep, emilstenstrom, fictorial, gdtroszak, homeworkprod, hrldcpr, hugovk, jmcs, keelanfh, kennethreitz, kingaj12, kipyin, linewriter1024, marksteve, mcdallas, mkai, naelsondouglas, nmurthy, parths007, prasad83, radarhere, ragne, razerm, stranger6667, tethik, vlcinsky, waldyrious


records's Issues

iLIKE support?

Not a big deal, but why does db.query('select * from papers where author iLIKE "%childress%"') throw an error? (Specifically: sqlite3.OperationalError: near "iLIKE": syntax error.)

db.query('select * from papers where LOWER(author) LIKE "%childress%"') works fine, so this is a minor issue.

Use Database() as context manager

We could move

self.db = psycopg2.connect(self.db_url, cursor_factory=BetterNamedTupleCursor)

to an __enter__ method and add an

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.db.close()

so Database can be used with the with keyword

example:

with records.Database('postgres://...') as db:
    data = db.query('select * from active_users').all()
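The proposed protocol, sketched on a stand-in class (illustrative only; `ManagedDatabase` is hypothetical and records.Database is not modified here):

```python
class ManagedDatabase:
    """Stand-in showing the proposed __enter__/__exit__ protocol."""

    def __init__(self, db_url):
        self.db_url = db_url
        self.closed = False  # stands in for a real connection handle

    def __enter__(self):
        # A real implementation would open the connection here.
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Runs even if the with-block raised; returning a falsy value
        # lets any exception propagate.
        self.close()

    def close(self):
        self.closed = True

with ManagedDatabase('postgres://...') as db:
    assert not db.closed
assert db.closed  # connection was closed on exit
```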

Support for SQLite

Support for SQLite should be straightforward, since its DB-API interface is built into Python.

neutralize README text

A couple of sentences toward the end start with "Of course", but IMO they could be rewritten without those parts:

Of course, all other features of Tablib are also available

Of course, the recommended installation method is pip:

Removing "Of course" from those sentences still conveys the same amount of information. The README, as currently written, contains things that are obvious to you, but might not be to the casual reader.

As always, thanks for the great work

ResultSet.next() discards rows

If you call next() before all() or __iter__(), then the returned row will not be included in _all_rows. Maybe this is intentional, but it's potentially confusing.

Gevent support

Does records support greenlets/gevent? I had inserts working on my DB using records, but when I wrapped the code inside a greenlet spawn, the code executes yet the DB shows no writes.

automatically dereference when only one key/value

Over in Postgres.py we have an API feature where we automatically dereference single-column queries. Do you want that feature, @kennethreitz? It would mean this:

>>> db.query("select foo from bar").all()
[2, 4, 6, 8]

Instead of this:

>>> db.query("select foo from bar").all()
[<Record "foo": 2>, <Record "foo": 4>, <Record "foo": 6>, <Record "foo": 8>]
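In the meantime, single-column results can be dereferenced manually, since Records support index access (plain tuples shown below as stand-ins for Record objects):

```python
# Stand-in for rows.all() on a single-column query.
rows = [(2,), (4,), (6,), (8,)]

# Pull the single value out of each row by index.
values = [r[0] for r in rows]
print(values)  # [2, 4, 6, 8]
```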

Need help with understanding `:key` syntax.

I cannot quite figure out how to insert locals into a query.

db.query('INSERT INTO persons VALUES(:name, :email)', name=name, email=email)

In [1]: import records as rc, requests as rq, json

In [2]: db = rc.Database('sqlite:///users.db')

In [3]: r = rq.get('http://api.randomuser.me/?nat=us&results=10')

In [4]: j = r.json()['results']

In [5]: for rec in j:
   ...:     user = rec['user']
   ...:     name = user['name']
   ...:     email = user['email']
   ...:     db.query('CREATE TABLE IF NOT EXISTS persons (name text, email text)')
   ...:     db.query('INSERT INTO persons VALUES(:name, :email)', name=name, email=email)
   ...:
---------------------------------------------------------------------------
InterfaceError                            Traceback (most recent call last)
/Users/thill/projects/play/python/charity/venv_charity/lib/python3.5/site-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
   1138                         parameters,
-> 1139                         context)
   1140         except Exception as e:

/Users/thill/projects/play/python/charity/venv_charity/lib/python3.5/site-packages/sqlalchemy/engine/default.py in do_execute(self, cursor, statement, parameters, context)
    449     def do_execute(self, cursor, statement, parameters, context=None):
--> 450         cursor.execute(statement, parameters)
    451

InterfaceError: Error binding parameter 0 - probably unsupported type.

The above exception was the direct cause of the following exception:

InterfaceError                            Traceback (most recent call last)
<ipython-input-6-7e8a8900c51d> in <module>()
      4     email = user['email']
      5     db.query('CREATE TABLE IF NOT EXISTS persons (name text, email text)')
----> 6     db.query('INSERT INTO persons VALUES(:name, :email)', name=name, email=email)
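A likely cause (an assumption from randomuser.me's response shape, not verified against your data): `user['name']` is itself a dict of title/first/last, and DB-API drivers can only bind primitive types, hence the "unsupported type" error. A minimal reproduction with the stdlib `sqlite3` module, plus the fix of flattening the dict before binding:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE persons (name text, email text)')

# What the API plausibly returns for 'name': a nested dict, not a string.
name = {'title': 'mr', 'first': 'john', 'last': 'doe'}

try:
    conn.execute('INSERT INTO persons VALUES (:name, :email)',
                 {'name': name, 'email': 'john@example.com'})
except (sqlite3.InterfaceError, sqlite3.ProgrammingError):
    # Binding a dict is rejected by the driver (exception class varies
    # by Python version).
    pass

# Flatten the nested dict into a string first, then binding works.
flat = '{first} {last}'.format(**name)
conn.execute('INSERT INTO persons VALUES (:name, :email)',
             {'name': flat, 'email': 'john@example.com'})
```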

Error from query with zero rows results via CLI

If I do a query (via the CLI interface), which returns zero rows, an IndexError is raised.

Example:

python records.py 'SELECT * FROM empty_table;' json --url=postgresql://postgres:[email protected]/database

This raises the following error:

Traceback (most recent call last):
  File "records.py", line 357, in <module>
    cli()
  File "records.py", line 351, in cli
    print(rows.export(arguments['<format>']))
  File "records.py", line 162, in export
    return self.dataset.export(format, **kwargs)
  File "records.py", line 171, in dataset
    first = self[0]
  File "records.py", line 153, in __getitem__
    return rows[0]
IndexError: list index out of range

race condition between ResultSet's all() and __iter__()

If you partially iterate over a ResultSet, and then call all(), it will return only what has been iterated over so far, not "all rows" as specified, i.e. something like:

rows = db.query('select * from active_users')
it = iter(rows)
print(next(it))  # row 1
print(rows.all())  # [row 1]
print(next(it))  # row 2
print(rows.all())  # [row 1, row 2]

A simple fix would be:

def all(self):
    # If rows aren't cached, fetch them.
    if not self._completed:
        self._all_rows.extend(self._rows)
        self._completed = True
    return self._all_rows

…but I have a feeling there will be a more elegant solution once you've fixed #13 🚗

parameter binding doesn't work well with IN clause

db.query('SELECT * FROM order WHERE pay_status IN (:pay_status) LIMIT :n', pay_status = (1, 2), n = 100)

It reports exception:

Traceback (most recent call last):
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
    context)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
    cursor.execute(statement, parameters)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/MySQLdb/cursors.py", line 250, in execute
    self.errorhandler(self, exc, value)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/MySQLdb/connections.py", line 42, in defaulterrorhandler
    raise errorvalue
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/MySQLdb/cursors.py", line 247, in execute
    res = self._query(query)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/MySQLdb/cursors.py", line 411, in _query
    rowcount = self._do_query(q)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/MySQLdb/cursors.py", line 374, in _do_query
    db.query(q)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/MySQLdb/connections.py", line 270, in query
    _mysql.connection.query(self, query)
_mysql_exceptions.OperationalError: (1241, 'Operand should contain 1 column(s)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "a.py", line 6, in <module>
    rows = db02.query('SELECT * FROM order WHERE pay_status IN (:pay_status) LIMIT :n', pay_status = (1, 2), n=100)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/records.py", line 237, in query
    cursor = self.db.execute(text(query), **params) # TODO: PARAMS GO HERE
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 914, in execute
    return meth(self, multiparams, params)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1010, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1146, in _execute_context
    context)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1341, in _handle_dbapi_exception
    exc_info
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 202, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 185, in reraise
    raise value.with_traceback(tb)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
    context)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
    cursor.execute(statement, parameters)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/MySQLdb/cursors.py", line 250, in execute
    self.errorhandler(self, exc, value)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/MySQLdb/connections.py", line 42, in defaulterrorhandler
    raise errorvalue
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/MySQLdb/cursors.py", line 247, in execute
    res = self._query(query)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/MySQLdb/cursors.py", line 411, in _query
    rowcount = self._do_query(q)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/MySQLdb/cursors.py", line 374, in _do_query
    db.query(q)
  File "/Users/liuyb/anaconda/lib/python3.5/site-packages/MySQLdb/connections.py", line 270, in query
    _mysql.connection.query(self, query)
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1241, 'Operand should contain 1 column(s)') [SQL: 'SELECT * FROM order WHERE pay_status IN (%s) LIMIT %s'] [parameters: ((1, 2), 100)]
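Until expanding parameters are supported, one workaround is to generate a numbered placeholder per element yourself. A sketch against the stdlib `sqlite3` module (`expand_in` is a hypothetical helper, not records API, and the table is renamed `orders` here since ORDER is a reserved word):

```python
import sqlite3

def expand_in(values, prefix='p'):
    """Build ':p0, :p1, ...' placeholders and the matching params dict."""
    names = ['{0}{1}'.format(prefix, i) for i in range(len(values))]
    placeholders = ', '.join(':' + n for n in names)
    return placeholders, dict(zip(names, values))

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE orders (id INTEGER, pay_status INTEGER)')
conn.executemany('INSERT INTO orders VALUES (?, ?)', [(1, 1), (2, 3)])

# One named parameter per element, instead of binding the tuple itself.
placeholders, params = expand_in((1, 2))
params['n'] = 100
sql = 'SELECT id FROM orders WHERE pay_status IN ({0}) LIMIT :n'.format(placeholders)
print(conn.execute(sql, params).fetchall())  # [(1,)]
```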

Style Check Failure in setup.py

Hello,

Noticed that in setup.py

  1. 're' was imported but never used.
  2. the function definition at line 19 needs to be at line 20, as it needs two blank lines above it.
  3. there are spaces around the '=' operator in line 34.

Corrected the same in the pull #87

Thanks and Regards
V Sandeep Srivastav

Support for JDBI style folder

JDBI is one of the saner database access libraries in the Java world. To solve the N+1 problem (e.g. a user with permissions), it allows the developer to specify a function that can fold rows into a hierarchy.

For example, say we have:

select u.*, p.user_perm from users u inner join permissions p on (p.user_id) where u.id=X

and let's say that the result is:

1             foo@b   CAN_DO_X
1             foo@b   CAN_DO_Y
1             foo@b   CAN_DO_Z

A folder would be a function that takes a row at a time, which also is passed in an accumulator (a dict of id -> typed object, in this case id->user).

For each row passed in, it would check the accumulator to see if the base object (user) existed. If not, it would make the base user object from the row and add it to the accumulator. It would then unpack the permission from the row, and add it to the list of permissions in the user object.

The output result of the user_folder would be a list of unique users, which might only be one user in this case, each of whom would have a list of permissions attached to them.

The beauty of this for you is that it doesn't require many changes, but it gives records support for N+1 queries very efficiently. You could add an option "folder" param to the query function; if supplied, call it once per row, and then return the values from the dict.
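The fold described above can be sketched as a plain function over row tuples (all names hypothetical; nothing here is existing records API):

```python
def fold_users(rows):
    """Fold (user_id, email, perm) rows into users with permission lists."""
    acc = {}  # the accumulator: id -> user dict
    for user_id, email, perm in rows:
        user = acc.get(user_id)
        if user is None:
            # Base object doesn't exist yet: build it from this row.
            user = acc[user_id] = {'id': user_id, 'email': email, 'perms': []}
        # Unpack the permission from the row and attach it.
        user['perms'].append(perm)
    return list(acc.values())

rows = [
    (1, 'foo@b', 'CAN_DO_X'),
    (1, 'foo@b', 'CAN_DO_Y'),
    (1, 'foo@b', 'CAN_DO_Z'),
]
print(fold_users(rows))
# [{'id': 1, 'email': 'foo@b', 'perms': ['CAN_DO_X', 'CAN_DO_Y', 'CAN_DO_Z']}]
```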

Bug with table headings

Database query:

rows = db.query("select * from users")

If the query returns no rows, the result comes back without table headers (screenshot), and if you then try to export the data there is an error (screenshot), because the headers are taken from the first row here:

@property
def dataset(self):
    """A Tablib Dataset representation of the ResultSet."""
    # Create a new Tablib Dataset.
    data = tablib.Dataset()

    # Set the column names as headers on Tablib Dataset.
    first = self[0]

    data.headers = first._fields

Is sqlalchemy an implementation detail?

Currently, the only way I can see to catch things like foreign key constraint is to catch the sqlalchemy.exc.IntegrityError raised and handle it. However, this makes me uncomfortable since there is no indication if sqlalchemy is currently just an implementation detail or a core part of records.

Can I safely handle exceptions raised by sqlalchemy and call it a day or should I be worried about stuff breaking if sqlalchemy gets swapped out for something else? If no, is there an official way of catching errors raised by foreign key constraints?

is there a way of create or update a record with a dictionary for data?

By example:

dict_data = {'table': 'table_1', 'set': {'name': 'Peter', 'last_name' : 'Smith'}, 'where': [('id', '=', '4')] }
db.update(dict_data)
dict_data = {'table': 'table_1', 'columns': {'name': 'Peter', 'last_name' : 'Smith'} }
db.create(dict_data)

I can be saying crazy. But it might be possible.

Delete rows

Hi,
My script is hanging once I try to delete some rows
sql_clause = 'DELETE FROM {0} WHERE interval_start = :interval_datetime'.format(self.name)
self.database.query(sql_clause, interval_datetime=interval_datetime)

where database is a records.Database object.

How can you use records to delete rows?
thx
Sephi

Can't import [python3]

I use Ubuntu 14.04 LTS. I installed the library system-wide using sudo pip3 install records, and it installed successfully. But when I try to import it (in the REPL), I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/dist-packages/records.py", line 6, in <module>
    import tablib
  File "/usr/local/lib/python3.4/dist-packages/tablib/__init__.py", line 3, in <module>
    from tablib.core import (
  File "/usr/local/lib/python3.4/dist-packages/tablib/core.py", line 15, in <module>
    from tablib import formats
  File "/usr/local/lib/python3.4/dist-packages/tablib/formats/__init__.py", line 6, in <module>
    from . import _csv as csv
  File "/usr/local/lib/python3.4/dist-packages/tablib/formats/_csv.py", line 6, in <module>
    from tablib.compat import is_py3, csv, StringIO
  File "/usr/local/lib/python3.4/dist-packages/tablib/compat.py", line 25, in <module>
    import tablib.packages.xlwt3 as xlwt
ImportError: No module named 'tablib.packages.xlwt3'

I checked, and tablib is installed. I installed xlwt3 manually, but that didn't solve (or change) the problem.

ResultSet slice can steal its parent's rows

Since you give the underlying iterable to the child ResultSet, rows can get consumed by the child without being cached in the parent, for example:

rows = db.query('select * from active_users limit 10')
list(rows[:5])  # all 10 rows
list(rows)  # only first 5 rows, because the rest were consumed by the child

The easiest fix is to not let slices go beyond their bounds, which imho makes more sense anyway. I'll make a quick pull request with that approach.

I also have some regression tests for these various iteration issues I've been reporting, I'll open a request to add those too.

ProgrammingError: Commands out of sync

I get the following error:

sqlalchemy.exc.ProgrammingError: (_mysql_exceptions.ProgrammingError) (2014, "Commands out of sync; you can't run this command now")

The Python file:

db = records.Database('mysql://root:@localhost/school')
db.query_file('data/sample.schema.sql')

The schema I executed is this:

DROP TABLE IF EXISTS `Student`;
CREATE TABLE IF NOT EXISTS `Student` (
`id` INTEGER PRIMARY KEY AUTOINCREMENT,
`created` DATETIME NOT NULL DEFAULT (DATETIME('NOW','LOCALTIME')),
`first_name` VARCHAR(255) NOT NULL,
`last_name` VARCHAR(255) NOT NULL,
`birth_date` DATE NOT NULL,
`sex` VARCHAR(255) NOT NULL,
`photo` BLOB,
`remarks` TEXT);

Sanitisation

I was sort of under the impression that some sort of sanitisation happens if you use query to insert your data.

db.query("SELECT * FROM users WHERE id = :user", user="Te'ArnaLambert")
> sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) near "ArnaLambert": syntax error [SQL: "SELECT * FROM users WHERE id = 'Te'ArnaLambert' ]

Is this not the case? If so, it seems a major problem for records which is all about text SQL.

Advice for batch inserts?

Hi,

Does anyone have advice for simplifying the task of performing a batch insert, such as the following MySQL example?

INSERT IGNORE
INTO t
    (:c0, :c1)
VALUES
    (:r_0_0, :r_0_1),
    (:r_1_0, :r_1_1)
;

It's a bit of a hassle to generate all of those param names. I suspect we all have reinvented similar helper functions for it. Is there any potential for records to simplify any part of this operation so that it's more for humans™?
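A sketch of the helper many of us end up reinventing: generate the `:r_i_j` placeholders and the flat params dict from a list of rows (`batch_insert_sql` is hypothetical, not a records feature):

```python
def batch_insert_sql(table, columns, rows):
    """Build a multi-row INSERT with :r_i_j placeholders and its params."""
    value_groups, params = [], {}
    for i, row in enumerate(rows):
        names = ['r_{0}_{1}'.format(i, j) for j in range(len(row))]
        value_groups.append('({0})'.format(', '.join(':' + n for n in names)))
        params.update(zip(names, row))
    sql = 'INSERT INTO {0} ({1}) VALUES {2}'.format(
        table, ', '.join(columns), ', '.join(value_groups))
    return sql, params

sql, params = batch_insert_sql('t', ['c0', 'c1'], [(1, 'a'), (2, 'b')])
print(sql)
# INSERT INTO t (c0, c1) VALUES (:r_0_0, :r_0_1), (:r_1_0, :r_1_1)
```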

Import from csv, excel, etc

I do a lot of data work, which requires importing and exporting CSVs. This library looks extremely useful, but it doesn't import from CSV. If it did, it'd probably be a tool I used every day.

Logistically, I'm not even sure what this would look like, but if it's something possible, it'd be great!

Thanks for another great tool.

Any way to pass connect_args={'sslmode': 'prefer'} with DB_URL?

I have to connect to Amazon redshift via database URI over a tunnel. Thus, I need to use:

connect_args={'sslmode': 'prefer'}

passed down to the sqlalchemy-redshift driver but I don't see any way to do this with the constructor:

   db = records.Database(db_url=db_url)

as documented here:

binarymatt/redshift_sqlalchemy#36

I'm confused because the constructor here:

def __init__(self, db_url=None, **kwargs):

clearly takes a kwargs argument and passes it on to the underlying engine, which would solve my problem, but none of the versions I get installed via pip take it. I've tried pip installing via git, but that doesn't seem to get the version in the aforementioned link either.

Any ideas?

Thanks!
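One possible workaround, assuming the psycopg2 driver: libpq options such as sslmode can ride along in the URL's query string, which gets handed down to the driver. A small stdlib sketch for appending the option (`with_query_param` is a hypothetical helper):

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

def with_query_param(db_url, **options):
    """Append driver options (e.g. sslmode) to a database URL."""
    parts = urlparse(db_url)
    query = dict(parse_qsl(parts.query))
    query.update(options)
    return urlunparse(parts._replace(query=urlencode(query)))

url = with_query_param('postgresql://user@host:5439/warehouse',
                       sslmode='prefer')
print(url)  # postgresql://user@host:5439/warehouse?sslmode=prefer
```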

Where is this going?

Just curious about the plan for this library going forward:

  1. Do you want this to be an ORM built on top of psycopg2?
  2. Do you want to add more database drivers?
  3. Do you want this to become an ORM for multiple databases or just Postgres?
  4. Are you waiting to see where the community takes it? 😄

Sending 😍 for Requests!!! Maybe this turns into something along those lines?

Not clear how to install drivers

This is probably Python/SQL 101 for many people, but when trying Records out, I received a message saying it didn't have a protocol handler for mysqldb. It's been a real struggle to understand that I need to install a driver, to find a suitable version and install it on a MacBook (I gave up after several hours), and eventually to install the driver from the MySQL site. I still don't know if this will work with Records or whether I'll have to use it in its raw form.

Failure on json export

I have the following table in a postgresql database:

elzilncu=> \d donor
            Table "public.donor"
 Column  |         Type          | Modifiers
---------+-----------------------+-----------
 donorno | integer               | not null
 dlname  | character varying(15) |
 dfname  | character varying(15) |
 dphone  | numeric(4,0)          |
 dstate  | character(2)          |
 dcity   | character varying(15) |
Indexes:
    "pk_donor" PRIMARY KEY, btree (donorno)
Referenced by:
    TABLE "gift" CONSTRAINT "fk_donatedby" FOREIGN KEY (donorno) REFERENCES donor(donorno)

That I can dump pretty easily in CSV:

❯❯❯ records "select * from donor" csv
donorno,dlname,dfname,dphone,dstate,dcity
101,Abrams,Louis,9018,GA,London
102,Aldinger,Dmitry,1521,GA,Paris
103,Beckman,Gulsen,8247,WA,Sao Paulo
104,Berdahl,Samuel,8149,WI,Sydney
105,Borneman,Joanna,1888,MD,Bombay
106,Brock,Scott,2142,AL,London
107,Buyert,Aylin,9355,AK,New York
108,Cetinsoy,Girwan,6346,AZ,Rome
109,Chisholm,John,4482,MA,Oslo
110,Crowder,Anthony,6513,NC,Stockholm
111,Dishman,Michelle,3903,NC,Helsinki
112,Duke,Peter,4939,FL,Tokyo
113,Evans,Ann,4336,GA,Singapore
114,Frawley,Todd,4785,MN,Perth
115,Guo,John,6247,MN,Moscow
116,Hammann,John,5369,ND,Kabaul
117,Hays,Cami,1352,SD,Lima
118,Herskowitz,Thomas,6872,MT,London
119,Jefts,Robert,8103,ME,Oslo

But the same operation fails if I try to export to JSON or YAML:

❯❯❯ records "select * from donor" json
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/tablib/packages/omnijson/core.py", line 63, in dumps
    return _engine[1](o)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/__init__.py", line 230, in dumps
    return _default_encoder.encode(obj)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/encoder.py", line 180, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: Decimal('9018') is not JSON serializable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/records", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.5/site-packages/records.py", line 345, in cli
    print(rows.export(arguments['<format>']))
  File "/usr/local/lib/python3.5/site-packages/records.py", line 160, in export
    return self.dataset.export(format, **kwargs)
  File "/usr/local/lib/python3.5/site-packages/tablib/core.py", line 464, in export
    return export_set(self, **kwargs)
  File "/usr/local/lib/python3.5/site-packages/tablib/formats/_json.py", line 22, in export_set
    return json.dumps(dataset.dict, default=date_handler)
  File "/usr/local/lib/python3.5/site-packages/tablib/packages/omnijson/core.py", line 69, in dumps
    raise JSONError(why)
tablib.packages.omnijson.core.JSONError: Decimal('9018') is not JSON serializable

Here is the full stack trace from Python:

In [3]: rows.export('json')
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-3-5693f926b1b0> in <module>()
----> 1 rows.export('json')

NameError: name 'rows' is not defined

In [4]: rows = db.query('select * from donor')

In [5]: rows.export('json')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.5/site-packages/tablib/packages/omnijson/core.py in dumps(o, **kwargs)
     62     try:
---> 63         return _engine[1](o)
     64

/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    229         default is None and not sort_keys and not kw):
--> 230         return _default_encoder.encode(obj)
    231     if cls is None:

/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/encoder.py in encode(self, o)
    198         # equivalent to the PySequence_Fast that ''.join() would do.
--> 199         chunks = self.iterencode(o, _one_shot=True)
    200         if not isinstance(chunks, (list, tuple)):

/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/encoder.py in iterencode(self, o, _one_shot)
    256                 self.skipkeys, _one_shot)
--> 257         return _iterencode(o, 0)
    258

/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/encoder.py in default(self, o)
    179         """
--> 180         raise TypeError(repr(o) + " is not JSON serializable")
    181

TypeError: Decimal('9018') is not JSON serializable

During handling of the above exception, another exception occurred:

JSONError                                 Traceback (most recent call last)
<ipython-input-5-5693f926b1b0> in <module>()
----> 1 rows.export('json')

/usr/local/lib/python3.5/site-packages/records.py in export(self, format, **kwargs)
    158     def export(self, format, **kwargs):
    159         """Export the RecordCollection to a given format (courtesy of Tablib)."""
--> 160         return self.dataset.export(format, **kwargs)
    161
    162     @property

/usr/local/lib/python3.5/site-packages/tablib/core.py in export(self, format, **kwargs)
    462             raise UnsupportedFormat('Format {0} cannot be exported.'.format(format))
    463
--> 464         return export_set(self, **kwargs)
    465
    466     # -------

/usr/local/lib/python3.5/site-packages/tablib/formats/_json.py in export_set(dataset)
     20 def export_set(dataset):
     21     """Returns JSON representation of Dataset."""
---> 22     return json.dumps(dataset.dict, default=date_handler)
     23
     24

/usr/local/lib/python3.5/site-packages/tablib/packages/omnijson/core.py in dumps(o, **kwargs)
     67
     68         if any([(issubclass(ExceptionClass, e)) for e in _engine[2]]):
---> 69             raise JSONError(why)
     70         else:
     71             raise why

JSONError: Decimal('9018') is not JSON serializable

Additional info:

In [18]: r = rows.next()

In [19]: r.keys()
Out[19]: ['donorno', 'dlname', 'dfname', 'dphone', 'dstate', 'dcity']

In [20]: r.get('dfname')
Out[20]: 'Dmitry'

In [21]: r.get('dphone')
Out[21]: Decimal('1521')

In [22]: r.get('donorno')
Out[22]: 102

Even more info:

In [25]: from sqlalchemy import *

In [27]: db = create_engine(dbUrl)

In [29]: cn = db.connect()

In [30]: db.name()

In [33]: res = db.execute("select * from donor")

In [38]: res.first()
Out[38]: (101, 'Abrams', 'Louis', Decimal('9018'), 'GA', 'London')

The error may actually originate in Tablib or SQLAlchemy. If so, please let me know and I will move the issue.
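Until Tablib handles Decimal, one workaround is a `default` hook for `json.dumps` that converts Decimal values (here to `str`, to preserve precision; converting to `float` is lossy). A sketch, not the records API:

```python
import json
from decimal import Decimal

def decimal_handler(obj):
    """json.dumps fallback: render Decimal as a string to preserve precision."""
    if isinstance(obj, Decimal):
        return str(obj)
    raise TypeError(repr(obj) + ' is not JSON serializable')

row = {'donorno': 101, 'dphone': Decimal('9018')}
print(json.dumps(row, default=decimal_handler))
# {"donorno": 101, "dphone": "9018"}
```

Tablib already passes a `default=date_handler` in `_json.py` (visible in the traceback), so extending that handler to cover Decimal would be the natural upstream fix.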

Should use self.db or self._db?

Hello kennethreitz, I'm studying your code to learn how to write Pythonic code :-D

One small question: should self._db be used instead of self.db in class Database? It doesn't need to be exposed to users, since it's only used internally.

Thanks for your help~~

Add parameters for constructing the Database url

Some people might not know about the host URL (I had to look it up). Maybe we could use parameters to construct the connection URL, something like:

class Database(object):
    """A Database connection."""

    def __init__(self, host='localhost', port=5432, user=None, password=None):
        ...
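Something like the following helper could assemble the URL from those parts (`build_database_url` is a hypothetical name, not part of records):

```python
def build_database_url(user, password, host='localhost', port=5432, dbname='postgres'):
    """Hypothetical helper: assemble a SQLAlchemy-style connection URL from parts."""
    return 'postgres://{}:{}@{}:{}/{}'.format(user, password, host, port, dbname)

url = build_database_url('aklaver', 'secret', dbname='production')
print(url)  # postgres://aklaver:secret@localhost:5432/production
```

(Note that `pass` is a reserved word in Python, so the parameter would need to be named `password` or similar.)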

use Postgres.py under the hood

Over in Gratipay-land we've got a library called Postgres.py that has a very similar scope to Records-the-library (it has no CLI). It's been in production for three or four years, and as you say, it's "much more robust" than Records from an API point of view. I know you're opinionated about API design. I believe Postgres.py's API is worthy (or nearly so) of the "for Humans" appellation (and I'm not sure I didn't actually run that by you at some point in the past ;-).

I like the Tablib integration and CLI that I see here on Records, and our APIs are pretty close.

@kennethreitz Would you be open to a PR that switches Records to use Postgres.py under the hood? I'd envision keeping the CLI exactly the same. I think we'd need to talk through API—what gets wrapped, what gets exposed, what might be upstreamed.

Waddya say? Up for collaborating? :-)

race condition in ResultSet.__iter__()

If you create two or more iterators over a ResultSet, then they consume ResultSet._rows together, meaning they all iterate over different rows. I.e. something like:

rows = db.query('select * from active_users')
itera = iter(rows)
iterb = iter(rows)
print(next(itera))  # row 1
print(next(iterb))  # row 2

Of course, once all rows have been iterated over and fully cached, then iterators behave correctly, since they are independently iterating over the ResultSet._all_rows list, something like:

rows = db.query('select * from active_users')
for row in rows: continue
itera = iter(rows)
iterb = iter(rows)
print(next(itera))  # row 1
print(next(iterb))  # row 1

Maybe this is intentional, but to me it seems confusing and not standard iterator behavior.
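A standard-behaving alternative would have each iterator replay the cache from the start before consuming new rows from the shared cursor. A minimal sketch of that idea (my own illustration, not the records implementation):

```python
class CachedRows:
    """Each iter() replays cached rows first, then continues consuming the
    shared cursor, so concurrent iterators see the same sequence."""
    def __init__(self, cursor):
        self._cursor = cursor          # any iterator of rows
        self._cache = []

    def __iter__(self):
        i = 0
        while True:
            if i >= len(self._cache):
                try:
                    self._cache.append(next(self._cursor))
                except StopIteration:
                    return
            yield self._cache[i]
            i += 1

rows = CachedRows(iter([1, 2, 3]))
itera = iter(rows)
iterb = iter(rows)
print(next(itera))  # row 1
print(next(iterb))  # row 1 -- independent iterators
```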

Support for Executing Multiple Statements

Hi,
I am trying to use the records query interface on a small project, but sadly it lacks support for executing multiple statements.
I am encountering this when trying to run SQL that is already stored in a file.
For example, when trying to run the following SQL for creating tables:

CREATE TABLE IF NOT EXISTS "rooms" (
    "id" INTEGER PRIMARY KEY AUTOINCREMENT,
    "name" TEXT UNIQUE,
    "type" TEXT
);
CREATE TABLE IF NOT EXISTS "people" (
    "id" INTEGER PRIMARY KEY AUTOINCREMENT,
    "name" TEXT,
    "accomodation" TEXT
);

Trying to run db.query_file('queries/create-tables.sql') fails with the following error:

sqlite3.Warning: You can only execute one statement at a time.

Is there a way to get support for executing such multiple statements at once?
Thanks for creating this package.
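As a stopgap until records supports this, the script can be run through the DBAPI directly: sqlite3's `executescript()` accepts multiple statements, unlike `execute()`, which raises the warning above. A sketch (this bypasses records entirely):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# executescript() runs a whole multi-statement SQL script at once.
conn.executescript("""
CREATE TABLE IF NOT EXISTS "rooms" (
    "id" INTEGER PRIMARY KEY AUTOINCREMENT,
    "name" TEXT UNIQUE,
    "type" TEXT
);
CREATE TABLE IF NOT EXISTS "people" (
    "id" INTEGER PRIMARY KEY AUTOINCREMENT,
    "name" TEXT,
    "accomodation" TEXT
);
""")
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'")]
print(sorted(tables))  # ['people', 'rooms']
```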

'RecordCollection' object has no attribute 'as_dict'

Following the usage example in the README:

import records

db = records.Database('sqlite:///test.db')
rows = db.query('select * from active_users')
rows.as_dict()

I am not able to use the as_dict method and it fails with the following error:
AttributeError: 'RecordCollection' object has no attribute 'as_dict'
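The installed records version may simply predate as_dict. If upgrading isn't an option, rows can be converted to dicts manually. A self-contained sketch using sqlite3's Row factory (not the records API) to show the same keys()-based conversion:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.row_factory = sqlite3.Row        # rows then expose keys(), like Record objects
conn.execute('CREATE TABLE active_users (username TEXT, active INTEGER)')
conn.execute("INSERT INTO active_users VALUES ('model-t', 1)")

rows = conn.execute('SELECT * FROM active_users').fetchall()
as_dicts = [dict(r) for r in rows]    # manual equivalent of rows.as_dict()
print(as_dicts)  # [{'username': 'model-t', 'active': 1}]
```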

cli output to .xls format is cast to a string in Python 3

If, when using Python 3, you try to output to Excel using the command line interface:

records 'select * from T' xls > output.xls

the file output.xls will not be binary-formatted, but will instead be a string beginning with b'\xd0\x...etc' because the byte string is cast to a string by Python 3's print function. Here's a quick but ugly patch. But maybe this should be imported as a short function from tablib instead. If so, I can add an issue to tablib.

This is the function that could either go here or in tablib:

from sys import stdout

def print_bytes(content):
    try:
        stdout.buffer.write(content)
    except AttributeError:
        stdout.write(content)

And it can be used inside the cli() function:

# Print results in desired format.
if arguments['<format>']:
    ## print(rows.export(arguments['<format>']))
    print_bytes(rows.export(arguments['<format>']))
else:
    print(rows.dataset)

Maybe it's worth making a version for the terminal?

Sometimes it is convenient to view a table or grab data as .json, .html, .csv, .yaml, or .xls without touching the database directly.
I have been writing scripts for this for a long time, but Unix systems don't ship such programs by default.
So, along with the library, users could be given a quicker way to import and export tables.

Column mismatch on export

records==0.2.0
Postgres 9.4.6

Table schema:

postgres@production=# \d cell_per
                          Table "public.cell_per"
     Column     |              Type              |        Modifiers         
----------------+--------------------------------+--------------------------
 line_id        | integer                        | not null
 category       | character varying(10)          | not null
 cell_per       | integer                        | not null
 ts_insert      | timestamp(0) without time zone | default now()
 ts_update      | timestamp(0) without time zone | 
 user_insert    | character varying(20)          | default "session_user"()
 user_update    | character varying(20)          | 
 plant_type     | character varying(6)           | not null
 season         | character varying(9)           | not null
 short_category | text                           | not null
Indexes:
    "cell_per_pkey" PRIMARY KEY, btree (line_id)
Foreign-key constraints:
    "fk_category" FOREIGN KEY (category) REFERENCES category(category) ON UPDATE CASCADE
Triggers:
    cell_per_ts_update BEFORE UPDATE ON cell_per FOR EACH ROW EXECUTE PROCEDURE ts_update()
    cell_per_user_update BEFORE UPDATE ON cell_per FOR EACH ROW EXECUTE PROCEDURE user_update()

Query:

db = records.Database('postgresql://aklaver:@localhost/production')
rs = db.query('select * from cell_per')

>>> rs[0].line_id 
4
>>> print rs.export('csv')
line_id,category,cell_per,ts_insert,ts_update,user_insert,user_update,plant_type,season,short_category
2004-06-02T15:11:26,HERB 3.5,18,,2004-06-02 15:11:26,,postgres,herb,none,HR3

When exporting the line_id takes on the value of ts_update. This happens across multiple export formats.

CLI format parameter should be optional but is required

It appears the docopt docstring for the cli function is incorrect.
Line 327 has <format> as a required parameter. Lines 385-389 check whether arguments['<format>'] is set and print rows.dataset if it is not, which indicates that the format parameter should be optional. Further, the example on line 344 lacks the format argument.

Suggested fix is to change line 327 as follows:
Line 327: records <query> [<format>] [<params> ...] [--url=<url>]

Current:
Line 327: records <query> <format> [<params> ...] [--url=<url>]

Lines 385-389:

# Print results in desired format.
if arguments['<format>']:
    print(rows.export(arguments['<format>']))
else:
    print(rows.dataset)

Edit:
I submitted a pull request with this change.

UTF-8 characters

When UTF-8-specific characters were present in the query (which was a unicode instance as well), this exception was raised:

  File "/Library/Python/2.7/site-packages/records.py", line 237, in query
    cursor = self.db.execute(text(query), **params) # TODO: PARAMS GO HERE
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 914, in execute
    return meth(self, multiparams, params)
  File "/Library/Python/2.7/site-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1010, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1146, in _execute_context
    context)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1344, in _handle_dbapi_exception
    util.reraise(*exc_info)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
    context)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
    cursor.execute(statement, parameters)
  File "/Library/Python/2.7/site-packages/MySQLdb/cursors.py", line 181, in execute
    query = query.encode(db.unicode_literal.charset)
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u017e' in position 107: ordinal not in range(256)
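The traceback shows MySQLdb encoding the query with the connection's default latin-1 charset. The failure is reproducible with plain string encoding, and (if I recall the MySQLdb/SQLAlchemy convention correctly) passing charset=utf8 in the connection URL avoids it:

```python
query = u'select * from users where name = \u017e'  # U+017E, LATIN SMALL LETTER Z WITH CARON

try:
    query.encode('latin-1')                # what MySQLdb does with its default charset
except UnicodeEncodeError as exc:
    print('fails as in the traceback:', exc)

utf8_bytes = query.encode('utf-8')         # succeeds once the connection speaks UTF-8
# e.g. records.Database('mysql://user:pass@host/db?charset=utf8')
```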

get_table_names implementation

The implementation for get_table_names is odd.

The following steps are executed every time, making the function much more expensive than it needs to be:

  1. a new engine is created rather than using the existing one to check out a connection
  2. a declarative base is created just to use its metadata, which is better created using sqlalchemy.MetaData()
  3. The whole database schema is reflected just to get the table names.

The fix would look something like:

from sqlalchemy import inspect

def __init__(...):
    engine = create_engine(...)
    self.db = engine.connect()
    self._engine = engine

def get_table_names(self):
    return inspect(self._engine).get_table_names()

Connection is not being closed

Hi guys. I'm having trouble closing the connection with Postgres, and the number of live connections keeps going up.

I already implemented these methods, but the connection is never closed or reused:

def __enter__(self):
    return self

def __exit__(self, exc, val, traceback):
    if self.db:
        self.db.close()

Could you tell me what I am doing wrong?
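For reference, a working context-manager sketch (note the double underscores on __enter__, which markdown tends to eat). Illustrated here with sqlite3 since I can't test against your Postgres setup; with records, self.db is a SQLAlchemy connection, so the pool may keep the underlying Postgres connection alive until the engine itself is disposed:

```python
import sqlite3

class Database:
    """Minimal sketch: close the connection when the with-block exits."""
    def __init__(self, path=':memory:'):
        self.db = sqlite3.connect(path)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, traceback):
        if self.db:
            self.db.close()
            self.db = None      # mark as closed so repeated exits are safe

with Database() as d:
    d.db.execute('select 1')
# d.db is None here; the sqlite connection has been closed
```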

CLI records -h doesn't work

I'm unable to get the CLI records to display full help with -h or --help option, it throws the following error.

PS D:\APPS\Python\3.5> records -h
Traceback (most recent call last):
  File "d:\apps\python\3.5\lib\runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\apps\python\3.5\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\APPS\Python\3.5\Scripts\records.exe\__main__.py", line 9, in <module>
  File "d:\apps\python\3.5\lib\site-packages\records.py", line 318, in cli
    arguments = docopt(cli_docs)
  File "d:\apps\python\3.5\lib\site-packages\docopt.py", line 575, in docopt
    extras(help, version, argv, doc)
  File "d:\apps\python\3.5\lib\site-packages\docopt.py", line 481, in extras
    print(doc.strip("\n"))
  File "d:\apps\python\3.5\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2122' in position 23: character maps to <undefined>
PS D:\APPS\Python\3.5> records --help
(fails with the same UnicodeEncodeError traceback)
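The docstring contains U+2122 (the trademark sign from "SQL for Humans™"), which the Windows console codepage cp437 cannot represent. One possible fix is to encode the help text defensively before printing, substituting characters the console encoding cannot handle. A sketch:

```python
# "Records: SQL for Humans™" contains U+2122 (TRADE MARK SIGN), which cp437
# (the default Windows console codepage) cannot represent.
doc = 'Records: SQL for Humans\u2122'

# Encode defensively, replacing unrepresentable characters with '?'.
safe = doc.encode('cp437', errors='replace').decode('cp437')
print(safe)  # Records: SQL for Humans?
```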
