Comments (3)
I see two options now, both breaking changes:
- Require more information about the file argument (e.g. with a string_content=["path","file"] argument), but that would only have a meaning for file arguments of type str (kinda ugly).
- As pandas.read_csv() does: accept str | io.StringIO types for file and always interpret the str type as a Path, and the io.StringIO as file content. This would require the user to convert a string to a StringIO object.
I like the second one. We can set the type of filepath_or_buffer to Path | io.StringIO. Then we don't use strings and can use the Path.is_file() method.
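A rough sketch of what that type-based dispatch could look like. The function name read_source is made up for illustration and is not part of pygef; only the Path | io.StringIO idea comes from the comment above.

```python
import io
from pathlib import Path
from typing import Union


def read_source(filepath_or_buffer: Union[Path, io.StringIO]) -> str:
    """Return the raw text from a Path on disk or an in-memory buffer.

    Hypothetical helper, not the pygef API.
    """
    if isinstance(filepath_or_buffer, io.StringIO):
        # A buffer is always treated as file content.
        return filepath_or_buffer.getvalue()
    if isinstance(filepath_or_buffer, Path):
        # A Path is always treated as a location on disk.
        if not filepath_or_buffer.is_file():
            raise FileNotFoundError(f"No such file: {filepath_or_buffer}")
        return filepath_or_buffer.read_text()
    # Plain str is rejected on purpose, so the intent never has to be guessed.
    raise TypeError(
        "Expected pathlib.Path or io.StringIO, "
        f"got {type(filepath_or_buffer).__name__}"
    )
```

With this shape a user who has the file content as a string would call read_source(io.StringIO(content)), and a user who has a location would call read_source(Path("some_file.gef")).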
from pygef.
In the current implementation, we just try to parse the possible contents of file, and if one attempt fails we move on to the next. The last parsing option that fails (parsing as bro-xml by default) is the one that provides the error. This is arbitrary behaviour and is the reason that the returned error doesn't reflect the actual problem.
I see the following resolutions now:
- Require more information about the file argument (e.g. with a string_content=["path","file"] argument), but that would only have a meaning for file arguments of type str (kinda ugly). (Breaking change)
- As pandas.read_csv() does: accept str | io.StringIO types for file and always interpret the str type as a Path, and the io.StringIO as file content. This would require the user to convert a string to a StringIO object. (Breaking change)
- Return a general custom exception (e.g. CPTParsingError) when all parsing options have failed. Although this will still return a ValueError for an erroneous gef-file path.
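For the third resolution, something along these lines might work. This is only a sketch: CPTParsingError is the name suggested above, while parse_with_fallback and the parsers mapping are assumptions for illustration and do not mirror the actual pygef internals.

```python
from typing import Callable, Dict


class CPTParsingError(Exception):
    """Raised when every parsing option has failed (hypothetical)."""

    def __init__(self, errors: Dict[str, Exception]):
        # Keep the individual errors so the caller can inspect each attempt.
        self.errors = errors
        details = "; ".join(f"{name}: {err}" for name, err in errors.items())
        super().__init__(f"Could not parse input with any parser ({details})")


def parse_with_fallback(content: str, parsers: Dict[str, Callable[[str], object]]):
    """Try each parser in order and return the first successful result."""
    errors: Dict[str, Exception] = {}
    for name, parser in parsers.items():
        try:
            return parser(content)
        except Exception as exc:
            # Remember why this option failed, then move on to the next one.
            errors[name] = exc
    # Report all failures instead of only the last one.
    raise CPTParsingError(errors)
```

The point of collecting every failure is that the message no longer depends on which parser happened to run last.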
from pygef.
When a string has the form of a path separated by slashes, or even a single short word, it can never be a valid XML or GEF file, right? All XML files need to start with a < character, so that's easy, and all GEF files have the form of key: value, so I think a proper heuristic would be:
Is almost certainly a path if:
- The length is <= 255 characters
- It ends with case-insensitive .gef or .xml
- It does not contain any < or : characters
- It can be opened from disk
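The checklist above could be sketched roughly like this. The function name looks_like_path is made up for illustration, and the "can be opened from disk" step is approximated with Path.is_file().

```python
from pathlib import Path


def looks_like_path(value: str) -> bool:
    """Heuristic: is this string almost certainly a path to a GEF/XML file?"""
    # Very long strings are assumed to be file content rather than a path.
    if len(value) > 255:
        return False
    # Case-insensitive check on the expected extensions.
    if not value.lower().endswith((".gef", ".xml")):
        return False
    # "<" suggests XML content and ":" suggests key: value content.
    if "<" in value or ":" in value:
        return False
    # Finally, the file has to exist on disk.
    return Path(value).is_file()
```

One caveat with the ":" rule as written: absolute Windows paths such as C:\data\cpt.gef contain a colon after the drive letter, so they would be rejected by this check.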
from pygef.