Giter Site home page Giter Site logo

php-thrift-sql's Introduction

PHP ThriftSQL

The ThriftSQL.phar archive aims to provide access to SQL-on-Hadoop frameworks for PHP. It bundles Thrift and various service packages together and exposes a common interface for running queries over the various frameworks.

Currently the following engines are supported:

  • Hive -- Over the HiveServer2 Thrift interface, SASL is enabled by default so username and password must be provided however this can be turned off with the setSasl() method before calling connect().
  • Impala -- Over the Impala Service Thrift interface which extends the Beeswax protocol.

Version Compatibility

This library is currently compiled against the Thrift definitions of the following database versions:

Using the compiler and base PHP classes of:

Usage Example

The recommended way to use this library is to get results from Hive/Impala via the memory efficient iterator which will keep the connection open and scroll through the results a couple rows at a time. This allows the processing of large result datasets one record at a time minimizing PHP's memory consumption.

// Load this lib
require_once __DIR__ . '/ThriftSQL.phar';

// Try out a Hive query via iterator object
$hive = new \ThriftSQL\Hive( 'hive.host.local', 10000, 'user', 'pass' );
$hiveTables = $hive
  ->connect()
  ->getIterator( 'SHOW TABLES' );

// Try out an Impala query via iterator object
$impala = new \ThriftSQL\Impala( 'impala.host.local' );
$impalaTables = $impala
  ->connect()
  ->setOption( 'MEM_LIMIT', '2gb' ) // optionally set some query options
  ->getIterator( 'SHOW TABLES' );

// Execute the Hive query and iterate over the result set
foreach( $hiveTables as $rowNum => $row ) {
  print_r( $row );
}

// Execute the Impala query and iterate over the result set
foreach( $impalaTables as $rowNum => $row ) {
  print_r( $row );
}

// Don't forget to close socket connection once you're done with it
$hive->disconnect();
$impala->disconnect();

The downside to using the memory efficient iterator is that we can only iterate over the result set once. If a second foreach is called on the same iterator object an exception is thrown by default to prevent the same query from executing on Hive/Impala again as results are not cached within the PHP client. This can be turned off however be aware iterating over the same iterator object may produce different results as the query is rerun.

Consider the following example:

// Connect to hive and get a rerun-able iterator
$hive = new \ThriftSQL\Hive( 'hive.host.local', 10000, 'user', 'pass' );
$results = $hive
  ->connect()
  ->getIterator( 'SELECT UNIX_TIMESTAMP()' )
  ->allowRerun( true );

// Execute the Hive query and get results
foreach( $results as $rowNum => $row ) {
  echo "Hive server time is: {$v[0]}\n";
}

sleep(3);

// Execute the Hive query a second time
foreach( $results as $rowNum => $row ) {
  echo "Hive server time is: {$v[0]}\n";
}

Which will output something like:

Hive server time is: 1517875200
Hive server time is: 1517875203

If the result set is small and it would be easier to load all of it into PHP memory the queryAndFetchAll() method can be used which will return a plain numeric multidimensional array of the full result set.

// Try out a small Hive query
$hive = new \ThriftSQL\Hive( 'hive.host.local', 10000, 'user', 'pass' );
$hiveTables = $hive
  ->connect()
  ->queryAndFetchAll( 'SHOW TABLES' );
$hive->disconnect();

// Print out the cached results
print_r( $hiveTables );
// Try out a small Impala query
$impala = new \ThriftSQL\Impala( 'impala.host.local' );
$impalaTables = $impala
  ->connect()
  ->queryAndFetchAll( 'SHOW TABLES' );
$impala->disconnect();

// Print out the cached results
print_r( $impalaTables );

Developing & Contributing

In order to rebuild this library you will need Composer to install dev dependencies and Apache Thrift to compile client libraries from the Thrift interface definition files.

Once dev tools are installed, make sure you get all git submodules:

$ git submodule init

And then the phar can be rebuilt using make:

$ make clean && make phar

NOTE: If you get a BadMethodCallException, it may come from any of the reasons mentioned in the PHP doc, or even a low soft limit on open file descriptors since Phar::compressfiles keeps all files opened until it writes the compressed phar.

php-thrift-sql's People

Contributors

bperson avatar coreh avatar joostshao avatar withinboredom avatar xyu avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

php-thrift-sql's Issues

Time out problem

I use this repository for php to SQL query hiveserver2. But one SQL query task may takes 20 minutes. So there is a time out problem. Can I set the time out limitation and if possible, how to change it?Thanks.

Invalid method name: 'query'

用Impala连接查询报错
`
$impala = new \ThriftSQL\Impala('....**, 21055, '1234');

$impalaTables = $impala->connect()->query('show tables');

print_f($impalaTables);
`
报错信息:
Thrift \ Exception \ TApplicationException (1) Invalid method name: 'query'

I look forward to your help!

Can't connect to Hive using a SSL connection

Anyone can help me?
I can't find a way to connect to Hive using a SSL connection.

I have connected and we have response and results from Hadoop using a non SSL connection, but we don't have an option to add this SSL certificates to Hive authentication.

This library support SSL? Or any alternative for this?

Thank you!

Exception only because there is no match rows

I got an time out exception when there is no rows macthed

select * from dbname.tablename limit 10 // ok

select * from dbname.tablename where metricname='x' limit 10 //no match rows

I got the flowing exception message:

Fatal error: Uncaught Thrift\Exception\TTransportException: TSocket: timed out reading 4 bytes from 192.168.2.32:21000:21000 in phar://php-thrift-sql/ThriftSQL.phar/Thrift/Transport/TSocket.php:274 Stack trace: #0 phar://php-thrift-sql/ThriftSQL.phar/Thrift/Transport/TTransport.php(74): Thrift\Transport\TSocket->read(4) #1 phar://php-thrift-sql/ThriftSQL.phar/Thrift/Protocol/TBinaryProtocol.php(305): Thrift\Transport\TTransport->readAll(4) #2 phar://php-thrift-sql/ThriftSQL.phar/Thrift/Protocol/TBinaryProtocol.php(197): Thrift\Protocol\TBinaryProtocol->readI32(NULL) #3 phar://php-thrift-sql/ThriftSQL.phar/Packages/Beeswax/BeeswaxService.php(325): Thrift\Protocol\TBinaryProtocol->readMessageBegin(NULL, 0, 0) #4 phar://php-thrift-sql/ThriftSQL.phar/Packages/Beeswax/BeeswaxService.php(292): ThriftSQL\BeeswaxServiceClient->recv_fetch() #5 phar://php-thrift-sql/ThriftS in phar://php-thrift-sql/ThriftSQL.phar/Thrift/Transport/TSocket.php on line 274

can not connect to hiveserver2

I can connect to hiveserver2 in beeline whith beeline> !connect jdbc:hive2://192.168.1.110:10010
but cannot conncet to hiveserver2 whith code blow (there are no username and password):
$hive = new \ThriftSQL\Hive( '192.168.1.110',10010); $hiveTables = $hive ->setSasl( false ) /* To turn SASL auth off, on by default */ ->connect() ->queryAndFetchAll( 'SHOW TABLES' ); print_r( $hiveTables );

and I got the error :
PHP Fatal error: Uncaught Thrift\Exception\TException: TSocket: Could not connect to 192.168.1.110:10010 (Connection refused [111]) in phar:///home/lixiyu/code/php-thrift-sql/ThriftSQL.phar/Thrift/Transport/TSocket.php:235 Stack trace: #0 phar:///home/lixiyu/code/php-thrift-sql/ThriftSQL.phar/ThriftSQL/Hive.php(47): Thrift\Transport\TSocket->open() #1 /home/lixiyu/code/php-thrift-sql/example.php(10): ThriftSQL\Hive->connect() #2 {main} thrown in phar:///home/lixiyu/code/php-thrift-sql/ThriftSQL.phar/Thrift/Transport/TSocket.php on line 235

can somebody help me ?

Add more error detection

If a query gets sent that can't be properly compiled we end up getting a Required field 'operationHandle' is unset! we should be checking for these types of errors earlier.

Props @coreh

你好,我有个问题请教下

连接服务器的时候,返回
Thrift\Exception\TTransportException:
”TSocket read 0 bytes“
不知道是哪里出了问题,非常感谢你的帮助!

make phar error

error

[root@qitv2302 bb]# make phar
git submodule update

Build error codes thrift file

src-thrift/impala/common/thrift/generate_error_codes.py
: No such file or directory
make: *** [impala] Error 127
[root@qitv2302 bb]# ls -a src-thrift/impala/common/thrift/generate_error_codes.py
-rwxr-xr-x 1 root 22020 May 16 19:43 src-thrift/impala/common/thrift/generate_error_codes.py

file is haved

TSocket read 0 bytes

Exception: TSocket read 0 bytes

try {
// Try out a Hive query via iterator object
$hive = new \ThriftSQL\Hive('10.10.0.1', 10000, null, null, 3600 * 24);
$hiveTables = $hive->connect()->queryAndFetchAll($hive_sql);
...
} catch (\Exception $ex) {
$message = $ex->getMessage();
}

Composer install phar not found during deployment.

The Path in autoload_static is set to a non-relative phar location. This gives problems when deploying the code to a server where the absolute location can differ from the original one.

public static $files = array (
'2cffec82183ee1cea088009cef9a6fc3' => DIR . '/..' . '/ezyang/htmlpurifier/library/HTMLPurifier.composer.php',
'c972dff9f16ac280b1304f543e741e8a' => DIR . '/..' . '/greenlion/php-sql-parser/src/PHPSQLParser/PHPSQLCreator.php',
'd6cde5c960e16951227fd8cd2000d80a' => DIR . '/..' . '/greenlion/php-sql-parser/src/PHPSQLParser/PHPSQLParser.php',
'5eb9c05cf55211ba85c38aa136f5e17e' => 'phar://C:\Users\User1\Documents\PhpstormProjects\querotails\vendor/automattic/php-thrift-sql/ThriftSQL.phar',
);

(Please help) Invalid method name: 'query'

When I connect impala,The result is "Thrift \ Exception \ TApplicationException (1) Invalid method name: 'query'"
The error line show:( 77. $x = new TApplicationException();)
image
My code:
$obj->connect()->queryAndFetchAll($sql)

I look forward to your help!

PHP 7.1 compatibility

Hi,

we would like to use this library in our project and wonder if the library itself and bundled 3rd party libraries are compatible with PHP 7.1.

Do you have any experiences specific to running the library against PHP >= 7.1 ?

We appreciate your work very much and thank you in advance for your response.

Kind regards,

Markus Wößner

ADITION technologies
Freiburg, Germany

Compatibility with PHP 8.1 - \iterator update

Deprecated: Return type of ThriftSQL\Utils\Iterator::next() should either be compatible with Iterator::next(): void, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in phar:.../ThriftSQL.phar/ThriftSQL/Utils/Iterator.php on line 52

Tasks

No tasks being tracked yet.

Insert query gives error

Hi,
first of all thanks a lot for sharing this repository. It worked flawlessly. I am able to connect and run few sql like show tables, show databases and select sqls. I am trying to do Insert operation now.
I am using Ambari and HDP 2.4 (Hive 1.2) and I have enabled the ACID property in order to enable the insert, update and delete.

While running Insert sql it is gicing me following error

$sql = "INSERT INTO TABLE mydb.variables_new2 (id, a_cost, b_cost) VALUES (2, 10, 20)";
$hiveTables = $hive
->setSasl( true )
->connect()
->query( $sql );
print_r( $hiveTables );

Output is:
ThriftSQL\HiveQuery Object ( [resp:ThriftSQL\HiveQuery:private] => ThriftSQL\TExecuteStatementResp Object ( [status] => ThriftSQL\TStatus Object ( [statusCode] => 0 [infoMessages] => [sqlState] => [errorCode] => [errorMessage] => ) [operationHandle] => ThriftSQL\TOperationHandle Object ( [operationId] => ThriftSQL\THandleIdentifier Object ( [guid] => [��R!CN�����},�� [secret] => �_�ڝ�@̲�:S���� ) [operationType] => 0 [hasResultSet] => [modifiedRowCount] => ) ) [client:ThriftSQL\HiveQuery:private] => ThriftSQL\TCLIServiceClient Object ( [input:protected] => Thrift\Protocol\TBinaryProtocol Object ( [strictRead:protected] => [strictWrite:protected] => 1 [trans_:protected] => Thrift\Transport\TSaslClientTransport Object ( [transport_:Thrift\Transport\TSaslClientTransport:private] => Thrift\Transport\TSocket Object ( [handle_:Thrift\Transport\TSocket:private] => Resource id #25 [host_:protected] => 172.16.1.150 [port_:protected] => 10000 [sendTimeoutSec_:Thrift\Transport\TSocket:private] => 0 [sendTimeoutUsec_:Thrift\Transport\TSocket:private] => 100000 [recvTimeoutSec_:Thrift\Transport\TSocket:private] => 0 [recvTimeoutUsec_:Thrift\Transport\TSocket:private] => 750000 [persist_:protected] => [debug_:protected] => [debugHandler_:protected] => error_log ) [username_:protected] => hive [password_:protected] => hive [saslComplete_:Thrift\Transport\TSaslClientTransport:private] => 1 [readBuffer_:Thrift\Transport\TSaslClientTransport:private] => [writeBuffer_:Thrift\Transport\TSaslClientTransport:private] => ) ) [output_:protected] => Thrift\Protocol\TBinaryProtocol Object ( [strictRead_:protected] => [strictWrite_:protected] => 1 [trans_:protected] => Thrift\Transport\TSaslClientTransport Object ( [transport_:Thrift\Transport\TSaslClientTransport:private] => Thrift\Transport\TSocket Object ( [handle_:Thrift\Transport\TSocket:private] => Resource id #25 [host_:protected] => 172.16.1.150 [port_:protected] => 10000 [sendTimeoutSec_:Thrift\Transport\TSocket:private] => 0 [sendTimeoutUsec_:Thrift\Transport\TSocket:private] => 100000 [recvTimeoutSec_:Thrift\Transport\TSocket:private] => 0 [recvTimeoutUsec_:Thrift\Transport\TSocket:private] => 750000 [persist_:protected] => [debug_:protected] => [debugHandler_:protected] => error_log ) [username_:protected] => hive [password_:protected] => hive [saslComplete_:Thrift\Transport\TSaslClientTransport:private] => 1 [readBuffer_:Thrift\Transport\TSaslClientTransport:private] => [writeBuffer_:Thrift\Transport\TSaslClientTransport:private] => ) ) [seqid_:protected] => 0 ) [ready:ThriftSQL\HiveQuery:private] => )

I have 1 question, the function queryAndFetchAll() worked for Select sqls. Shall I use the same function to run Insert, update and delete command or I have to use any other function.
Where I can see list of all functions that you have created in this library(phar).

I am trying since last week to make it work.

Please help.

Thanks in advance,
Paresh Kendre ([email protected])

'Required field 'operationHandle' is unset!

Hi,

I try to use your implementation for connecting to a hive cluster, which works as expected with the given sample. But when I try to retrieve anything from hive that needs some map/reduce jobs run, I've got the following exception:
'Thrift\Exception\TApplicationException' with message 'Required field 'operationHandle' is unset!
Can you help me where should I set this?

Thx in advance

Want a way to get the query result schema information

Hello,
I'd tried this lib last week and it working well with Impala 2.x server. But I can't find a way to get the query result schema information (field name, data type, etc...).
Is there anyway to get it?
This information is useful when I build a tool to allow consumer execute their SQL over impala.
Thanks.

can't connect impala,and the log of impala server show:invalid sasl status

i try connect the impala,but is error:
client error:
PHP Fatal error: Uncaught exception 'Thrift\Exception\TTransportException' with message 'TSocket read 0 bytes' in phar:///home/map/rd/liuxiaochun/php-thrift-sql-master/ThriftSQL.phar/Thrift/Transport/TSocket.php:269
Stack trace:
#0 phar:///home/map/rd/liuxiaochun/php-thrift-sql-master/ThriftSQL.phar/Thrift/Transport/TTransport.php(74): Thrift\Transport\TSocket->read(4)
#1 phar:///home/map/rd/liuxiaochun/php-thrift-sql-master/ThriftSQL.phar/Thrift/Protocol/TBinaryProtocol.php(305): Thrift\Transport\TTransport->readAll(4)
#2 phar:///home/map/rd/liuxiaochun/php-thrift-sql-master/ThriftSQL.phar/Thrift/Protocol/TBinaryProtocol.php(197): Thrift\Protocol\TBinaryProtocol->readI32(NULL)
#3 phar:///home/map/rd/liuxiaochun/php-thrift-sql-master/ThriftSQL.phar/Packages/Beeswax/BeeswaxService.php(160): Thrift\Protocol\TBinaryProtocol->readMessageBegin(NULL, 0, 0)
#4 phar:///home/map/rd/liuxiaochun/php-thrift-sql-master/ThriftSQL.phar/Packages/Beeswax/BeeswaxService.php(129): ThriftSQL\BeeswaxServiceClient->recv_query()

server error:
TThreadPoolServer: TServerTransport died on accept: invalid sasl status

你好,我执行example.php的时候一直报错Could not connect to,求助!

我改了example.php里的impala连接配置,php example.php时报错(x.x.x.x代替我真实的impala配置):
Fatal error: Uncaught Thrift\Exception\TException: TSocket: Could not connect to x.x.x.x:21050 (Operation timed out [60]) in phar:///Users/xxx/Code/www/php-thrift-sql/ThriftSQL.phar/Thrift/Transport/TSocket.php:250

example.php里的代码只保留了impala:

connect() ->setOption( 'REQUEST_POOL', 'php' ) ->setOption( 'MEM_LIMIT', '256mb' ) ->getIterator( 'SHOW TABLES' ); foreach( $impalaTables as $rowNum => $row ) { print_r( $row ); } $impala->disconnect(); 希望得到您的帮助,非常感谢~~~

Required field 'client_protocol' unset error

Hey,
I run the example, but receiving "Required field 'client_protocol' unset error" error.

require_once DIR .'/../../libraries/php-thrift-sql/src/autoload.php';

// Try out a Hive query
$hive = new \ThriftSQL\Hive( 'xxx.com', 10000, 'my_user', my_password' );
$hiveTables = $hive
->connect()
->queryAndFetchAll( 'SHOW TABLES' );
print_r( $hiveTables );

// Don't forget to clear the client and close socket.
$hive->disconnect();

Returned:

PHP Fatal error: Uncaught exception 'Thrift\Exception\TApplicationException' with message 'Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null, username:biapp, password:5Pusecup)' in /biapps/hadoop/libraries/php-thrift-sql/src/Packages/TCLIService/TCLIService.php:164
Stack trace:

0 /biapps/hadoop/libraries/php-thrift-sql/src/Packages/TCLIService/TCLIService.php(131): ThriftSQL\TCLIServiceClient->recv_OpenSession()

1 /biapps/hadoop/libraries/php-thrift-sql/src/ThriftSQL/Hive.php(70): ThriftSQL\TCLIServiceClient->OpenSession(Object(ThriftSQL\TOpenSessionReq))

2 /biapps/hadoop/importers/hive_test/test.php(8): ThriftSQL\Hive->connect()

3 {main}

thrown in /biapps/hadoop/libraries/php-thrift-sql/src/Packages/TCLIService/TCLIService.php on line 164

Only support 'default' database?

Thank you for the great work.
But I wonder how to set database name in the query. I tried prepend 'use xxxx;' in the query string, but it throws

Fatal error:  Uncaught exception 'Thrift\Exception\TApplicationException' with message 'Required field 'operationHandle' is unset! Struct:TGetOperationStatusReq(operationHandle:null)' in phar:///data/web/kop-statistics/vendor/automattic/php-thrift-sql/ThriftSQL.phar/Packages/TCLIService/TCLIService.php:725

Other statments except selecting database works well.

Thanks.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.