Comments (2)
tl;dr: This doesn't answer your question, but it's a long explanation of why this feature (probably) doesn't exist. And also, why the behavior you've noticed ("001" being interpreted as a string and not a number, by an importing program) is just a quirk specific to that program – it's not CSV standard behavior.
FWIW, how a CSV file is interpreted (particularly, how each column is typecast) is generally up to the interpreting program. For example, importing your latter example (i.e. with quoted COUNTYFP10 values) into a pandas.DataFrame gives the same result as importing the non-quoted data: the COUNTYFP10 column will be interpreted as integers:
>>> import pandas as pd
>>> df = pd.read_csv('/tmp/thedata.csv')
>>> print(df)
       name  COUNTYFP10
0     Adair           1
1    Andrew           3
2  Atchison           5
>>> print(df.dtypes)
name          object
COUNTYFP10     int64
dtype: object
Not sure how Agate handles the import; I'm just using pandas as an example of another client program that has its own way of auto-guessing the data types of imported CSVs.
The upshot of all this: I'd be really surprised if Agate or csvkit had a way to specify quoting by column when exporting to CSV, because there wouldn't be any point. There's no accepted standard for inferring the datatypes of an imported CSV, because CSV treats every value as text, as opposed to something like JSON, which has a concept of strings, numbers, and booleans.
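To illustrate the "everything is text" point, here is a minimal sketch using only Python's standard csv module (not Agate). The sample data is made up to mirror the county example, and the parse helper is a hypothetical name. It shows that quoted and unquoted values come out of the parser as identical strings:

```python
import csv
import io

# Made-up sample data mirroring the county example, with and without quotes.
quoted = 'name,COUNTYFP10\nAdair,"001"\nAndrew,"003"\n'
unquoted = 'name,COUNTYFP10\nAdair,001\nAndrew,003\n'

def parse(text):
    """Return every row of a CSV string as a list of fields (hypothetical helper)."""
    return list(csv.reader(io.StringIO(text)))

rows_quoted = parse(quoted)
rows_unquoted = parse(unquoted)

# With the default dialect, quoting leaves no trace once the file is parsed.
print(rows_quoted == rows_unquoted)
# Every field is a plain str; any typecasting is up to the importing program.
print(rows_quoted[1])
```

Any program that treats "001" differently from 001 is applying its own convention on top of this.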
My advice is that you should accept having to configure datatyping on the importing program's side, e.g. in Excel's Text Import Wizard.
Or, in pandas, by setting the dtype argument, e.g.
>>> df = pd.read_csv('/tmp/thedata.csv', dtype={'COUNTYFP10': str})
>>> print(df)
       name  COUNTYFP10
0     Adair         001
1    Andrew         003
2  Atchison         005
>>> print(df.dtypes)
name          object
COUNTYFP10    object
dtype: object
It's a pain in the ass, but this kind of vagueness is just inherent to the CSV format. You don't want to depend on application-specific quirks when figuring out the import-export workflow.
to_csv passes through keyword arguments to Python's csv module. You can import csv and pass quoting=csv.QUOTE_ALL or maybe even quoting=csv.QUOTE_MINIMAL to to_csv to achieve what you want. https://docs.python.org/3/library/csv.html#csv.QUOTE_ALL
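As a rough sketch of what those two quoting constants do, here they are passed straight to the standard library's csv.writer. Agate's to_csv is not exercised here; the rows and the dump helper are made up for illustration:

```python
import csv
import io

# Made-up rows; the county codes are strings so the leading zeros survive.
rows = [["name", "COUNTYFP10"], ["Adair", "001"], ["Andrew", "003"]]

def dump(rows, quoting):
    """Serialize rows to a CSV string with the given quoting mode (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=quoting, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()

all_quoted = dump(rows, csv.QUOTE_ALL)    # quotes every field
minimal = dump(rows, csv.QUOTE_MINIMAL)   # quotes only fields that need it

print(all_quoted)
print(minimal)
```

Note that csv.QUOTE_MINIMAL is the csv module's default, so for this data it leaves every field unquoted. Neither mode quotes a single column selectively; both apply to the whole file.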