Comments (7)
This should definitely be fixed. The problem is, how to do it?
The "Open CSV" option loads the CSV into memory. That memory then gets wiped when the server restarts after the plugin is installed.
Three options I can think of:
- Keep track of the file paths to the CSVs that were opened, and re-open them when the server restarts. Downside here is what if the user has deleted or moved the CSV in the meantime?
- Figure out a way to dump the in-memory database to a temporary file and then re-open it again when the server restarts
- Switch
temporary
to be an on-disk database in a temporary directory as opposed to an in-memory database
from datasette-app.
Option 2 looks like it might be a good one, thanks to the VACUUM INTO
command. This isn't in older versions of SQLite but Datasette Desktop always has SQLite 3.36.0 thanks to https://github.com/indygreg/python-build-standalone/blob/1e58aaf6f1b82a32fabb4bd0ea6535af6afe2718/pythonbuild/downloads.py#L309-L319
Here's a proof of concept:
% ~/.datasette-app/venv/bin/python
Python 3.9.6 (default, Jul 24 2021, 22:49:46)
[Clang 12.0.1 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> conn = sqlite3.connect(":memory:")
>>> conn.execute("select sqlite_version()").fetchall()
[('3.36.0',)]
>>> conn.execute("create table foo (id integer primary key)")
<sqlite3.Cursor object at 0x10f5b1ce0>
>>> conn.execute("insert into foo (id) values (1)")
<sqlite3.Cursor object at 0x10f5b17a0>
>>> conn.execute("insert into foo (id) values (2)")
<sqlite3.Cursor object at 0x10f5b1ce0>
>>> conn.execute("insert into foo (id) values (3)")
<sqlite3.Cursor object at 0x10f5b17a0>
>>> conn.execute("select * from foo").fetchall()
[(1,), (2,), (3,)]
>>> conn.execute("vacuum into '/tmp/backup.db'")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
sqlite3.OperationalError: cannot VACUUM from within a transaction
>>> conn.isolation_level = None
>>> conn.execute("vacuum into '/tmp/backup.db'")
<sqlite3.Cursor object at 0x10f5b1c00>
Having run that I have a file in /tmp/backup.db
which looks like this:
~ % sqlite3 /tmp/backup.db .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE foo (id integer primary key);
INSERT INTO foo VALUES(1);
INSERT INTO foo VALUES(2);
INSERT INTO foo VALUES(3);
COMMIT;
from datasette-app.
I found that fix for sqlite3.OperationalError: cannot VACUUM from within a transaction
in ghaering/pysqlite#109 (comment)
from datasette-app.
But... how would I restore that saved temporary database file back INTO the in-memory database? Could I use vacuum into
in the opposite direction?
from datasette-app.
https://www.sqlite.org/lang_vacuum.html says:
The argument to INTO can be a URI filename if URI filenames are enabled. URL filenames are enabled if any of the following are true:
- The SQLite library was compiled with -DSQLITE_USE_URI=1.
- The sqlite3_config(SQLITE_CONFIG_URI,1) interfaces was invoked at start-time.
- The database connection that is running the VACUUM INTO statement was originally opened using the SQLITE_OPEN_URI flag.
Datasette in https://github.com/simonw/datasette/blob/d57ab156b35ec642549fb69d08279850065027d2/datasette/database.py#L72-L79 does this:
if self.memory_name:
uri = "file:{}?mode=memory&cache=shared".format(self.memory_name)
conn = sqlite3.connect(
uri,
uri=True,
check_same_thread=False,
)
So it should be possible to use VACUUM INTO
to restore that backed up database file to the in-memory temporary
database.
from datasette-app.
Yes, this works:
>>> import sqlite3
>>> backup_db = sqlite3.connect("/tmp/backup.db", uri=True)
>>> conn = sqlite3.connect("file:temporary?mode=memory&cache=shared", uri=True)
>>> backup_db.execute("vacuum into 'file:temporary?mode=memory&cache=shared'")
<sqlite3.Cursor object at 0x11099af80>
>>> conn.execute("select * from foo").fetchall()
[(1,), (2,), (3,)]
The order mattered - I had to open conn
before I tried executing vacuum into
.
The uri=True
argument when opening backup_db
was also necessary.
from datasette-app.
This is going to need some new API endpoints in datasette-app-support
:
/-/dump-temporary-to-file
/-/restore-temporary-from-file
I'll leave it to the Electron app to define the file - I thought about using tempfile
in Python but I don't think that will survive the server restart.
from datasette-app.
Related Issues (20)
- Can't "go back" after choosing view as JSON or CSV and saving/storing queries in underlying SQLite
- Allow for load / open of data from URL where CSV (or JSON?) is compressed
- Allow handling of larger CSV files e.g. CFPB complaints data HOT 1
- Switch to Playwright for automated tests HOT 28
- Failing to render in retina resolution when running on my MacBook Pro (macOS Catalina) HOT 1
- Research faster launch times
- Mechanism for plugins that need a specific DB HOT 3
- "Loading..." hangs on launch after parsing "tag_name" in a plugin HOT 6
- npm test is failing HOT 2
- All sorts of features not working due to bad HTTP requests HOT 9
- London Fire Brigade demo breaks HOT 7
- Import CSV from URL should show errors
- New Playwright tests should exercise key features
- Ability to use templates
- The icon looks blurry in the alt-tab window
- Loading... hangs on launch with packaging.version.InvalidVersion HOT 6
- "About Datasette" button fails to open information window HOT 2
- Unable to load preload script HOT 14
- Global Power Plant database is not found. HOT 1
- plugin menu throws '500' error
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from datasette-app.