koteiito / node-athena
A simple Node.js client for AWS Athena
License: MIT License
Hello,
First, I would like to ask a question: does the createClient function return a new AWS connection every time we call it?
Do you think having a singleton createClient function would be nice? If so, should I create a pull request with a method called createClientSingleton for this purpose?
Thanks,
Ugurcan
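One way to get singleton behaviour without changing the library might be to memoize the factory call. The sketch below is only an illustration under stated assumptions: makeSingletonFactory and fakeCreateClient are hypothetical names, and the stand-in factory merely mimics the createClient(clientConfig, awsConfig) call shape so the example runs without AWS.

```javascript
// Hypothetical helper: wraps any createClient-style factory so that repeated
// calls with the same configuration return the same client instance.
function makeSingletonFactory(createClient) {
  const cache = new Map();
  return function createClientSingleton(clientConfig, awsConfig) {
    // JSON.stringify is a simple cache key; it assumes plain, serializable
    // configs whose keys are always written in the same order.
    const key = JSON.stringify([clientConfig, awsConfig]);
    if (!cache.has(key)) {
      cache.set(key, createClient(clientConfig, awsConfig));
    }
    return cache.get(key);
  };
}

// Stand-in for athena.createClient, used only to make the sketch runnable.
let created = 0;
function fakeCreateClient(clientConfig, awsConfig) {
  created += 1;
  return { id: created, clientConfig, awsConfig };
}

const createClientSingleton = makeSingletonFactory(fakeCreateClient);
const a = createClientSingleton({ bucketUri: 's3://xxxx' }, { region: 'xxxx' });
const b = createClientSingleton({ bucketUri: 's3://xxxx' }, { region: 'xxxx' });
console.log(a === b, created); // true 1
```

Keying the cache on the configuration keeps clients for different buckets or regions separate, which a single global instance would not.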
Hi,
Good day.
What IAM roles are required by the package? Thanks.
Regards.
JJ
This problem appears almost every time I use execute(sql).toPromise().
After debugging, I noticed that the error is thrown here:
node-athena/build/lib/stream.js
Line 66 in a3f84b9
TypeError: Cannot read property 'length' of undefined
at AthenaStream.<anonymous> (/var/task/node_modules/athena-client/build/lib/stream.js:93:40)
at next (native)
at fulfilled (/var/task/node_modules/athena-client/build/lib/stream.js:4:58)
The version of athena-client is 2.0.0.
I'm trying to run the example:
var clientConfig = {
  bucketUri: 's3://xxxx'
}
var awsConfig = {
  region: 'xxxx',
}
var athena = require("athena-client")
var client = athena.createClient(clientConfig, awsConfig)
client.execute('SELECT 1').toPromise()
  .then(function(data) {
    console.log(data)
  })
  .catch(function(err) {
    console.error(err)
  })
But I'm getting the following error:
Unexpected key 'WorkGroup' found in params
Any idea why?
Hi,
I am getting an "Athena is not a constructor" error while using browserify. It looks like Athena is not included in the browser SDK. Could you please suggest how to resolve this issue?
The query 'SHOW TABLES IN hoge' returns these records:
hoge
fuga
piyo
But node-athena returned only these rows, with the first record used as the column name:
[ Row { hoge: 'fuga' },
Row { hoge: 'piyo' }]
The sample code I used is here.
const client = createClient(
  { bucketUri: 's3://bucket_path' },
  { region: 'ap-northeast-1' }
)
const tables = await client
  .execute(`SHOW TABLES IN hoge`)
  .toPromise()
console.log(tables.records)
Hi,
Thanks for putting this library together, as the library recommended in AWS's own docs (athena-express) has no built-in support for streaming results.
One issue I'm running into is that, when streaming results, the operations I perform on them are significantly slower than the rate at which the data is streamed in. The data then buffers until I exceed the maximum number of memory-mapped locations allowed at the operating-system level. Is there currently a way with this library to restrict how quickly the data is streamed in, based on the number of data events that have been successfully processed?
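One pattern that may help, assuming the object returned by toStream() behaves like a standard Node.js Readable (which is async-iterable): consume it with for await, so the next record is only pulled after the previous one has been processed. In the sketch below the async generator is a stand-in for the real stream, and processAll and slowHandler are hypothetical names.

```javascript
// Stand-in for client.execute(sql).toStream(); a Node.js Readable in object
// mode can be consumed with the same `for await` loop.
async function* fakeRecordStream() {
  for (let i = 1; i <= 3; i += 1) {
    yield { _col0: String(i) };
  }
}

// Pulls one record at a time: the next record is not requested until the
// handler's promise for the current record has settled.
async function processAll(records, handler) {
  let processed = 0;
  for await (const record of records) {
    await handler(record);
    processed += 1;
  }
  return processed;
}

// Hypothetical slow consumer that takes about 10 ms per record.
const seen = [];
function slowHandler(record) {
  return new Promise((resolve) => {
    setTimeout(() => {
      seen.push(record._col0);
      resolve();
    }, 10);
  });
}

const done = processAll(fakeRecordStream(), slowHandler);
done.then((count) => console.log(count, seen));
```

With a classic 'data' listener, the same effect usually needs explicit pause() and resume() calls on the stream around each slow operation.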
I want to get query results after creating a CTAS table with skipFetchResult = true.
I tried creating a new client with a different client config that omits the skipFetchResult setting.
However, I still do not get any results with the new client.
Great library! I've been using it to execute and capture the results of a variety of Athena SQL commands. They all work apart from the CREATE TABLE AS.
When I execute the following, I get a "NoSuchKey: The specified key does not exist." error.
const query = "CREATE TABLE newtable WITH (format='ORC') AS SELECT * FROM rawtable";
client
  .execute(query)
  .toPromise()
  .then(result => {})
  .catch(error => {})
I am hitting a problem while fetching my results.
When a row in the dataset contains a comma (,), the buffer is incorrectly split on that comma without accounting for the enclosing double quotes (").
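For reference, a quote-aware split has to track whether the parser is currently inside double quotes. The following is a minimal sketch of that idea, not the library's parser: splitCsvLine is a hypothetical helper, and it assumes one record per line with RFC 4180-style "" escapes.

```javascript
// Splits a single CSV line into fields, treating commas inside double quotes
// as data and "" inside a quoted field as an escaped quote.
function splitCsvLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i += 1) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote inside a quoted field
        i += 1;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ',') {
      fields.push(current); // field separator outside quotes
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

console.log(splitCsvLine('"1","Doe, John","say ""hi"""'));
```

This sketch does not handle quoted fields that span several lines; a streaming parser would have to carry the inQuotes state across chunk boundaries.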
I'd like to stream results into my local Node program. However, the results also get saved to a CSV file in an S3 bucket. Can the latter be made optional?
Does this clean up the bucket after the results are returned? If not, feature request!
Hi, when trying the code I get the following error: createClient is not a function.
Hi,
I'm executing the simple query from the README.md file:
await client.execute('SELECT 1').toPromise()
  .then((data) => {
    console.log(data)
  })
And I get an array with a type definition: records: [ Row { _col0: '1' } ].
This wasn't mentioned in the docs. How can I parse such a thing in JavaScript?
{ records: [ Row { _col0: '1' } ],
queryExecution:
{ QueryExecutionId: '1379ac84-4b2f-4629-a708-5eafd594d725',
Query: 'SELECT 1',
ResultConfiguration:
{ OutputLocation: 's3://........bucket/1379ac84-4b2f-4629-a708-5eafd594d725.csv' },
QueryExecutionContext: { Database: 'default' },
Status:
{ State: 'SUCCEEDED',
SubmissionDateTime: 2018-08-23T12:12:52.706Z,
CompletionDateTime: 2018-08-23T12:12:54.296Z },
Statistics: { EngineExecutionTimeInMillis: 1285, DataScannedInBytes: 0 } } }
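The Row { ... } prefix in that output is just how console.log prints an object created from a class named Row; its properties can be read like those of any other object. A minimal sketch, where the Row class and the result literal are stand-ins modelled on the output above rather than the library's actual implementation:

```javascript
// Stand-in for the library's row type, for illustration only.
class Row {}

// Mimics the shape of the result printed above.
const result = {
  records: [Object.assign(new Row(), { _col0: '1' })],
};

// Read a column directly; values arrive as strings in the output above.
const first = result.records[0];
const value = Number(first._col0);

// Copy rows into plain objects, e.g. before serializing or comparing them.
const plain = result.records.map((row) => ({ ...row }));

console.log(value, JSON.stringify(plain)); // 1 [{"_col0":"1"}]
```

Giving the column an alias in the SQL (for example SELECT 1 AS one) should replace the generated _col0 name with a more readable key.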
Hey, I would like to ask for the query ID to be added to the return value of the 'execute' method.
That would let us get the query results from Athena without re-querying (and would greatly reduce the bills).
Hi,
I'm trying to use this client with an Angular project and am running into several issues.
const { athena } = require('athena-client');
@Injectable({
  providedIn: 'root'
})
export class AthenaTestGqService {
  clientConfig = {
    bucketUri: constants.aws.athena.tempLocation
  };
  awsConfig = {
    region: constants.aws.region
  };
  constructor() {
    const client = athena.createClient(this.clientConfig, this.awsConfig);
    client.execute('SELECT * FROM my_table LIMIT 20', function (err, data) {
      if (err) {
        console.log(err);
      }
      console.log(data);
    });
  }
}
The error this produces is
TypeError: Cannot read property 'createClient' of undefined
Try to execute this code:
return athenaUtil.createClient({ bucketUri: 'MY S3 OUTPUT' }, { region: process.env.AWS_REGION }).execute('BAD SQL STRING').toStream();
It throws "UnhandledPromiseRejectionWarning: InvalidRequestException: line 1:1: mismatched input 'BAD' expecting"
I get the error "Athena is not a constructor". How can I fix it?
When I create a client, it says:
aws.Athena is not a constructor
When I use this package to query my Athena database, it keeps creating junk CSV files and adding them to my bucket.
How can I prevent that from happening?
Hi, I was just wondering which versions of node this is compatible with because AWS Lambda uses Node v6. I want to make sure that I can use this with it.
Is there any way to get support for a proxy?
https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/node-configuring-proxies.html
Hi, I wanted to thank you for your work. Querying Athena from JS can be cumbersome and your library really helps.
We have encountered an issue with parsing results from Athena queries. In particular, if a Glue column is marked as boolean, the information is lost in translation when parsing the .csv results file.
Typing information is stored in a nearby .csv.metadata file, see:
https://docs.aws.amazon.com/athena/latest/ug/querying.html
I don't think the library parses any of the typings when reading the results from the CSV, which is mostly OK in JS. Strings are still strings, and numbers are properly cast. But Boolean("false") === true, and this behaviour can cause very hard-to-track bugs (like the one that led me here).
I am opening this issue in case you are not aware of the problem.
Cheers
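Until type information is used by the parser, one possible workaround is to cast known boolean columns explicitly after fetching. The sketch below is only an illustration: castBooleans and the booleanColumns list are hypothetical, and it assumes the CSV carries the literal strings true and false.

```javascript
// Converts the string values of the named columns into real booleans.
// Anything other than 'true' or 'false' (for example an empty string)
// is mapped to null instead of being guessed.
function castBooleans(records, booleanColumns) {
  return records.map((record) => {
    const copy = { ...record }; // leave the input rows untouched
    for (const column of booleanColumns) {
      const raw = String(copy[column]).toLowerCase();
      copy[column] = raw === 'true' ? true : raw === 'false' ? false : null;
    }
    return copy;
  });
}

const rows = [
  { id: '1', active: 'false' },
  { id: '2', active: 'true' },
  { id: '3', active: '' },
];
const typed = castBooleans(rows, ['active']);
console.log(Boolean('false'), typed.map((r) => r.active)); // true [ false, true, null ]
```

The list of boolean columns has to come from somewhere the caller trusts, such as the table schema; the sketch does not read the .csv.metadata file.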
Hi,
Good day.
Is there any way to pass the QueryExecutionId to get a previously executed result set? Thanks.
Regards.
JJ
I spent a lot of time trying to find a problem in my code, but the problem was hidden in plain sight, on this line of code: https://github.com/KoteiIto/node-athena/blob/master/src/index.ts#L42
I am using AWS STS to access another AWS profile and run a request with temporary credentials that are passed to the createClient function. After the function call, those credentials are set globally, which breaks the rest of the AWS client calls (such as S3, where the default server settings must be used): the responses come back as Access Denied.
Thank you for providing this handy library!
I notice that you keep a local counter of calls, and will queue queries when the concurrency limit is hit.
Just curious why? Athena has its own way of queueing, and would not fail the queries either.
Thanks!
Yang
Hi, I'd like to use this library with the Serverless Framework (https://serverless.com/framework/). Are there any samples showing how to make this work, or is it outside the scope of this project? Thanks for the great code!
Hello,
First of all, big thanks for your great work on that lib, we are using it and it works like a charm.
AWS has released an update a few weeks back for what they call CTAS queries:
https://docs.aws.amazon.com/athena/latest/ug/ctas.html
It offers great features, however, we hit a glitch using node-athena on that kind of query.
The "outputLocation" received in the response is not pointing to the csv result or the metadata file.
Instead, the response is s3://[bucket_name]/[path]/[query_id] without the file extension.
When running this kind of query with node-athena, we hit an error because the S3 key does not exist.
We have been able to patch the lib so it can handle this kind of query.
We have a compiled working version available at:
https://github.com/jubry/node-athena
I'm not a coder, so I'm not claiming that my patch is how this should be handled.
I just changed a few lines of code in one file (client.js):
export interface AthenaClientConfig extends AthenaRequestConfig {
  pollingInterval?: number
  queryTimeout?: number
  concurrentExecMax?: number
  execRightCheckInterval?: number
  noResultExpected?: boolean
}
NB: noResultExpected?: boolean added
if (!config.noResultExpected) {
  const resultsStream = this.request.getResultsStream(
    queryExecution.ResultConfiguration.OutputLocation,
  )
  resultsStream.pipe(csvTransform)
}
NB: if (!config.noResultExpected) { added
By doing that, we are able to keep using your library.
Would it be possible for you to implement such functionality in a proper way?
It would be greatly appreciated.
Here is an SQL statement to reproduce the problem:
CREATE TABLE database.table WITH
( format = 'PARQUET',
external_location = 's3://mybucket_name/')
AS SELECT 1 as myfield
Let me know if you have any questions about it.
Thanks in advance,
And again, congratulations on your great work.
Regards,
:)
What are the minimum required permissions to use this module?
Also, is it possible to have the credentials inferred from the role that is running the module, instead of having to hardcode the accessKeyId and secretAccessKey into the code that runs the query?