Comments (4)
Because of the way this library leans on itertable, this is trickier than I expected. So far I've had to subclass or replace just as many parts of itertable as of data_wizard, so I'm probably going to package my final fix as a separate app, like data_wizard.sources. Otherwise it seems like I'd just be adding a lot of otherwise useless functionality to itertable to support data_wizard, just so that data_wizard can support django-storages.
from django-data-wizard.
I have integrated Django Data Wizard with S3 before, but that was with very large (multi-GB) CSV files and some heavy customization to parallelize the import across several Lambda workers. For the simple case with django-storages, I think all that is needed is to get the file data from S3 into a BytesIO on the file attribute of a custom itertable class. Perhaps something like this:
# myapp/wizard.py
from io import BytesIO

import boto3
from django.conf import settings
from itertable import ExcelFileIter  # or whichever format
from data_wizard.loaders import FileLoader

s3 = boto3.client('s3')


class ExcelS3Iter(ExcelFileIter):
    def load(self):
        response = s3.get_object(
            Bucket=self.bucket,
            Key=self.filename,
        )
        # Buffer the streaming body into a seekable BytesIO (untested)
        self.file = BytesIO(response['Body'].read())


class S3Loader(FileLoader):
    def load_iter(self):
        return ExcelS3Iter(
            bucket=settings.AWS_STORAGE_BUCKET_NAME,
            filename=self.file.name,
        )


# myproject/settings.py
DATA_WIZARD = {
    'LOADER': 'myapp.wizard.S3Loader',
}
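The buffering step can be tried in isolation, without S3 or itertable. The following is a minimal sketch using only the standard library: NonSeekableStream and buffer_stream are hypothetical names invented for this illustration, and the stand-in class only mimics the read() behavior of a streaming response body (which, to my understanding, cannot be rewound).

```python
import io


class NonSeekableStream:
    """Hypothetical stand-in for a streaming body: read() only, no seek()."""

    def __init__(self, data: bytes):
        self._buf = io.BytesIO(data)

    def read(self, amt=None):
        return self._buf.read(amt)


def buffer_stream(stream) -> io.BytesIO:
    """Read the whole stream into memory and return a seekable BytesIO."""
    return io.BytesIO(stream.read())


body = NonSeekableStream(b"id,name\n1,alpha\n")
buffered = buffer_stream(body)

# A parser can now jump around the data, e.g. to measure it and rewind.
buffered.seek(0, io.SEEK_END)
size = buffered.tell()
buffered.seek(0)
first_line = buffered.readline()
```

This holds the entire file in memory, which is fine for the simple case but not for the multi-GB files mentioned above.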
I believe no other customization of itertable should be necessary. That said, itertable exists primarily to support data_wizard, so anything to make that support better is a valid contribution IMO. The only caveat is that itertable shouldn't need to know anything about Django, so it could for example have an optional boto3 integration, but not django-storages specifically.
The goal with having itertable be its own library is to make it easier to test the file loading and parsing code in isolation, without worrying about the complexity introduced by data_wizard's task runner.
Actually, I think a better approach in this specific case is to update itertable.load_file() to accept arbitrary file-like objects (wq/itertable@5a47f32). Then in data_wizard it's just a matter of passing the file object directly from the storage backend to itertable (4a63066).
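To illustrate the general pattern (not the actual code in those commits), here is a rough sketch of a loader that accepts either a filename or an already-open file-like object. The name load_rows is hypothetical, and it parses CSV with the standard library rather than going through itertable. In a Django project, the file-like object would come from the storage backend, e.g. default_storage.open(name).

```python
import csv
import io
import os
import tempfile


def load_rows(source):
    """Hypothetical helper: accept a filename or an open file-like object."""
    if hasattr(source, "read"):
        # File-like object, e.g. one returned by a storage backend
        data = source.read()
        if isinstance(data, bytes):
            data = data.decode("utf-8")
        return list(csv.DictReader(io.StringIO(data)))
    # Otherwise treat the argument as a path on the local filesystem
    with open(source, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# The same data loaded from an in-memory object and from a local path
rows_from_object = load_rows(io.BytesIO(b"id,name\n1,alpha\n"))

with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
) as tmp:
    tmp.write("id,name\n1,alpha\n")
rows_from_path = load_rows(tmp.name)
os.remove(tmp.name)
```

With this shape, the caller never needs to know whether the file lives on local disk or in remote storage.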
I haven't tested this with django-storages specifically, but it should just work. If you would like to try it out before the next release, be sure to update both itertable and data_wizard to the latest development builds.
This fix has been released.