Spark SQL examples on Kubernetes
Submit SQL to the Thrift server with Beeline
Run the Thrift server in a pod
sh sbin/start-thriftserver.sh \
--master k8s://https://kubernetes.default.svc.cluster.local:443 \
--name spark-thriftserver \
--conf spark.executor.instances=1 \
--conf spark.kubernetes.driver.pod.name=sparkthrift-pod \
--conf spark.driver.host=sparkthrift-headless.default.svc.cluster.local \
--conf spark.driver.port=1888 \
--conf spark.kubernetes.container.image=zhixingheyitian/spark:spark2.4.1
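The command above sets spark.driver.host to sparkthrift-headless.default.svc.cluster.local, which implies that a headless Service with that name exists in the default namespace so that executors can reach the driver. The post does not show that Service, so the following is only a sketch of what it might look like. The function name render_headless_service, the port name driver-rpc, and the selector label app=sparkthrift are hypothetical; the selector must match whatever labels the driver pod actually carries.

```shell
#!/bin/sh
# Print a headless Service manifest (clusterIP: None) for the driver pod.
# Arguments: service name, driver port, value of the (hypothetical) "app" label.
render_headless_service() {
  name="$1"
  port="$2"
  app_label="$3"
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: ${name}
  namespace: default
spec:
  clusterIP: None
  selector:
    app: ${app_label}
  ports:
    - name: driver-rpc
      port: ${port}
      targetPort: ${port}
EOF
}
```

One possible use is to pipe the output into kubectl: render_headless_service sparkthrift-headless 1888 sparkthrift | kubectl apply -f -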
Submit a query
The pod running the Thrift server has the IP 172.17.0.13.
The Thrift server's default port is 10000.
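The pod IP changes whenever the pod is recreated, so it may be convenient to look it up instead of hard-coding it. Below is a minimal sketch, assuming kubectl is configured for the cluster and the pod is named sparkthrift-pod as in the command above. The helper names jdbc_url and pod_ip are made up for this example.

```shell
#!/bin/sh
# Build a HiveServer2-style JDBC URL from a host and an optional port.
# The port defaults to 10000, the Thrift server default mentioned above.
jdbc_url() {
  host="$1"
  port="${2:-10000}"
  printf 'jdbc:hive2://%s:%s\n' "$host" "$port"
}

# Look up the IP of a pod by name (needs kubectl access to the cluster).
pod_ip() {
  kubectl get pod "$1" -o jsonpath='{.status.podIP}'
}
```

With these helpers, the Beeline call could be written as ./bin/beeline -u "$(jdbc_url "$(pod_ip sparkthrift-pod)")".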
# ./bin/beeline -u jdbc:hive2://172.17.0.13:10000
Connecting to jdbc:hive2://172.17.0.13:10000
log4j:WARN No appenders could be found for logger (org.apache.hive.jdbc.Utils).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Connected to: Spark SQL (version 2.4.1)
Driver: Hive JDBC (version 1.2.1.spark2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1.spark2 by Apache Hive
0: jdbc:hive2://172.17.0.13:10000> SELECT * FROM tpcds_1g_parquet.store_sales limit 10;
+------------------+------------------+-------------+-----------------+--------------+--------------+-------------+--------------+--------------+-------------------+--------------+--------------------+----------------+-----------------+----------------------+---------------------+------------------------+--------------------+-------------+----------------+--------------+----------------------+----------------+--+
| ss_sold_date_sk | ss_sold_time_sk | ss_item_sk | ss_customer_sk | ss_cdemo_sk | ss_hdemo_sk | ss_addr_sk | ss_store_sk | ss_promo_sk | ss_ticket_number | ss_quantity | ss_wholesale_cost | ss_list_price | ss_sales_price | ss_ext_discount_amt | ss_ext_sales_price | ss_ext_wholesale_cost | ss_ext_list_price | ss_ext_tax | ss_coupon_amt | ss_net_paid | ss_net_paid_inc_tax | ss_net_profit |
+------------------+------------------+-------------+-----------------+--------------+--------------+-------------+--------------+--------------+-------------------+--------------+--------------------+----------------+-----------------+----------------------+---------------------+------------------------+--------------------+-------------+----------------+--------------+----------------------+----------------+--+
| 2451897 | 50904 | 5695 | 38271 | 442082 | 2693 | 46104 | 4 | 26 | 1 | 91 | 47.2 | 65.13 | 31.26 | 0.0 | 2844.66 | 4295.2 | 5926.83 | 170.67 | 0.0 | 2844.66 | 3015.33 | -1450.54 |
| 2451897 | 50904 | 1871 | 38271 | 442082 | 2693 | 46104 | 4 | 263 | 1 | 75 | 83.79 | 93.84 | 3.75 | 0.0 | 281.25 | 6284.25 | 7038.0 | 14.06 | 0.0 | 281.25 | 295.31 | -6003.0 |
| 2451897 | 50904 | 3533 | 38271 | 442082 | 2693 | 46104 | 4 | 179 | 1 | 15 | 90.79 | 128.01 | 24.32 | 240.76 | 364.8 | 1361.85 | 1920.15 | 8.68 | 240.76 | 124.04 | 132.72 | -1237.81 |
| 2451897 | 50904 | 14989 | 38271 | 442082 | 2693 | 46104 | 4 | 89 | 1 | 33 | 28.98 | 40.57 | 23.53 | 0.0 | 776.49 | 956.34 | 1338.81 | 7.76 | 0.0 | 776.49 | 784.25 | -179.85 |
| 2451897 | 50904 | 7928 | 38271 | 442082 | 2693 | 46104 | 4 | 30 | 1 | 14 | 87.61 | 145.43 | 84.34 | 0.0 | 1180.76 | 1226.54 | 2036.02 | 11.8 | 0.0 | 1180.76 | 1192.56 | -45.78 |
| 2451897 | 50904 | 15233 | 38271 | 442082 | 2693 | 46104 | 4 | 72 | 1 | 70 | 96.43 | 109.93 | 47.26 | 0.0 | 3308.2 | 6750.1 | 7695.1 | 165.41 | 0.0 | 3308.2 | 3473.61 | -3441.9 |
| 2451897 | NULL | 9497 | 38271 | NULL | NULL | NULL | NULL | NULL | 1 | NULL | NULL | NULL | 21.32 | NULL | NULL | 2389.86 | 4492.62 | NULL | NULL | 527.67 | 538.22 | -1862.19 |
| NULL | 50904 | 15193 | 38271 | 442082 | NULL | 46104 | NULL | NULL | 1 | 24 | NULL | 9.8 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| 2452129 | 31625 | 1219 | 95293 | 605087 | 4481 | 40275 | 1 | 53 | 2 | 40 | 4.75 | 8.02 | 0.88 | 0.0 | 35.2 | 190.0 | 320.8 | 1.4 | 0.0 | 35.2 | 36.6 | -154.8 |
| 2452129 | 31625 | 16745 | 95293 | 605087 | 4481 | 40275 | 1 | 130 | 2 | 99 | 70.57 | 141.14 | 77.62 | 0.0 | 7684.38 | 6986.43 | 13972.86 | 537.9 | 0.0 | 7684.38 | 8222.28 | 697.95 |
+------------------+------------------+-------------+-----------------+--------------+--------------+-------------+--------------+--------------+-------------------+--------------+--------------------+----------------+-----------------+----------------------+---------------------+------------------------+--------------------+-------------+----------------+--------------+----------------------+----------------+--+
10 rows selected (38.474 seconds)
0: jdbc:hive2://172.17.0.13:10000>
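The session above is interactive. For scripts, Beeline also accepts a statement on the command line through its -e option. The wrapper below is a small sketch. The names run_beeline_query and BEELINE_BIN are hypothetical, and the default path assumes the command is run from the Spark home directory, as in the transcript.

```shell
#!/bin/sh
# Path to the beeline launcher; override BEELINE_BIN if it lives elsewhere.
BEELINE_BIN="${BEELINE_BIN:-./bin/beeline}"

# Run one SQL statement against a Thrift server and exit.
# Arguments: JDBC URL, SQL text.
run_beeline_query() {
  "$BEELINE_BIN" -u "$1" -e "$2"
}
```

For example: run_beeline_query jdbc:hive2://172.17.0.13:10000 "SELECT * FROM tpcds_1g_parquet.store_sales limit 10;"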
Submit a query in spark-sql
Run spark-sql in a pod
bin/spark-sql \
--master k8s://https://kubernetes.default.svc.cluster.local:443 \
--name spark-sql \
--conf spark.executor.instances=1 \
--conf spark.kubernetes.driver.pod.name=sparkthrift-pod \
--conf spark.driver.host=sparkthrift-headless.default.svc.cluster.local \
--conf spark.driver.port=1888 \
--conf spark.kubernetes.container.image=zhixingheyitian/spark:spark2.4.1
Submit a query
spark-sql> SELECT * FROM tpcds_1g_parquet.store_sales limit 10;
19/05/07 12:26:01 INFO FileSourceStrategy: Pruning directories with:
19/05/07 12:26:01 INFO FileSourceStrategy: Post-Scan Filters:
19/05/07 12:26:01 INFO FileSourceStrategy: Output Data Schema: struct<ss_sold_date_sk: int, ss_sold_time_sk: int, ss_item_sk: int, ss_customer_sk: int, ss_cdemo_sk: int ... 21 more fields>
19/05/07 12:26:01 INFO FileSourceScanExec: Pushed Filters:
19/05/07 12:26:02 INFO CodeGenerator: Code generated in 303.215875 ms
19/05/07 12:26:02 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 293.8 KB, free 413.6 MB)
19/05/07 12:26:02 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 25.5 KB, free 413.6 MB)
19/05/07 12:26:02 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on sparkthrift-headless.default.svc.cluster.local:40599 (size: 25.5 KB, free: 413.9 MB)
19/05/07 12:26:02 INFO SparkContext: Created broadcast 0 from processCmd at CliDriver.java:376
19/05/07 12:26:02 INFO FileSourceScanExec: Planning scan with bin packing, max size: 54957761 bytes, open cost is considered as scanning 4194304 bytes.
19/05/07 12:26:02 INFO SparkContext: Starting job: processCmd at CliDriver.java:376
19/05/07 12:26:02 INFO DAGScheduler: Got job 0 (processCmd at CliDriver.java:376) with 1 output partitions
19/05/07 12:26:02 INFO DAGScheduler: Final stage: ResultStage 0 (processCmd at CliDriver.java:376)
19/05/07 12:26:02 INFO DAGScheduler: Parents of final stage: List()
19/05/07 12:26:02 INFO DAGScheduler: Missing parents: List()
19/05/07 12:26:02 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[3] at processCmd at CliDriver.java:376), which has no missing parents
19/05/07 12:26:02 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 19.7 KB, free 413.6 MB)
19/05/07 12:26:02 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 6.3 KB, free 413.6 MB)
19/05/07 12:26:02 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on sparkthrift-headless.default.svc.cluster.local:40599 (size: 6.3 KB, free: 413.9 MB)
19/05/07 12:26:03 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1161
19/05/07 12:26:03 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[3] at processCmd at CliDriver.java:376) (first 15 tasks are for partitions Vector(0))
19/05/07 12:26:03 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
19/05/07 12:26:03 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 172.17.0.15, executor 1, partition 0, ANY, 8421 bytes)
19/05/07 12:26:03 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 172.17.0.15:42745 (size: 6.3 KB, free: 413.9 MB)
19/05/07 12:26:04 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.17.0.15:42745 (size: 25.5 KB, free: 413.9 MB)
19/05/07 12:26:48 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 44960 ms on 172.17.0.15 (executor 1) (1/1)
19/05/07 12:26:48 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
19/05/07 12:26:48 INFO DAGScheduler: ResultStage 0 (processCmd at CliDriver.java:376) finished in 45.248 s
19/05/07 12:26:48 INFO DAGScheduler: Job 0 finished: processCmd at CliDriver.java:376, took 45.377769 s
2451897 50904 5695 38271 442082 2693 46104 4 26 1 91 47.2 65.13 31.26 0.0 2844.66 4295.2 5926.83 170.67 0.0 2844.66 3015.33 -1450.54
2451897 50904 1871 38271 442082 2693 46104 4 263 1 75 83.79 93.84 3.75 0.0 281.25 6284.25 7038.0 14.06 0.0 281.25 295.31 -6003.0
2451897 50904 3533 38271 442082 2693 46104 4 179 1 15 90.79 128.01 24.32 240.76 364.8 1361.85 1920.15 8.68 240.76 124.04 132.72 -1237.81
2451897 50904 14989 38271 442082 2693 46104 4 89 1 33 28.98 40.57 23.53 0.0 776.49 956.34 1338.81 7.76 0.0 776.49 784.25 -179.85
2451897 50904 7928 38271 442082 2693 46104 4 30 1 14 87.61 145.43 84.34 0.0 1180.76 1226.54 2036.02 11.8 0.0 1180.76 1192.56 -45.78
2451897 50904 15233 38271 442082 2693 46104 4 72 1 70 96.43 109.93 47.26 0.0 3308.2 6750.1 7695.1 165.41 0.0 3308.2 3473.61 -3441.9
2451897 NULL 9497 38271 NULL NULL NULL NULL NULL 1 NULL NULL NULL 21.32 NULL NULL 2389.86 4492.62 NULL NULL 527.67 538.22 -1862.19
NULL 50904 15193 38271 442082 NULL 46104 NULL NULL 1 24 NULL 9.8 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
2452129 31625 1219 95293 605087 4481 40275 1 53 2 40 4.75 8.02 0.88 0.0 35.2 190.0 320.8 1.4 0.0 35.2 36.6 -154.8
2452129 31625 16745 95293 605087 4481 40275 1 130 2 99 70.57 141.14 77.62 0.0 7684.38 6986.43 13972.86 537.9 0.0 7684.38 8222.28 697.95
Time taken: 50.035 seconds, Fetched 10 row(s)
19/05/07 12:26:48 INFO SparkSQLCLIDriver: Time taken: 50.035 seconds, Fetched 10 row(s)
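In the transcript above, the result rows are interleaved with INFO log lines. For a one-off query, spark-sql can also take the statement through its -e option and exit when it finishes. The wrapper below is a sketch that reuses the same settings as the interactive command. The names run_spark_sql and SPARK_SQL_BIN are hypothetical.

```shell
#!/bin/sh
# Path to the spark-sql launcher; override SPARK_SQL_BIN if it lives elsewhere.
SPARK_SQL_BIN="${SPARK_SQL_BIN:-bin/spark-sql}"

# Run one SQL statement with the same settings as the interactive session, then exit.
# Argument: SQL text.
run_spark_sql() {
  "$SPARK_SQL_BIN" \
    --master k8s://https://kubernetes.default.svc.cluster.local:443 \
    --name spark-sql \
    --conf spark.executor.instances=1 \
    --conf spark.kubernetes.driver.pod.name=sparkthrift-pod \
    --conf spark.driver.host=sparkthrift-headless.default.svc.cluster.local \
    --conf spark.driver.port=1888 \
    --conf spark.kubernetes.container.image=zhixingheyitian/spark:spark2.4.1 \
    -e "$1"
}
```

For example: run_spark_sql "SELECT * FROM tpcds_1g_parquet.store_sales limit 10;" 2>/dev/null. With Spark's default log4j settings the log lines should go to stderr, so the redirect is expected to leave only the rows on stdout; this depends on the logging configuration baked into the image.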
Submit a query in spark-shell
Run spark-shell in a pod
bin/spark-shell \
--master k8s://https://kubernetes.default.svc.cluster.local:443 \
--name spark-shell \
--conf spark.executor.instances=1 \
--conf spark.kubernetes.driver.pod.name=sparkthrift-pod \
--conf spark.driver.host=sparkthrift-headless.default.svc.cluster.local \
--conf spark.driver.port=1888 \
--conf spark.kubernetes.container.image=zhixingheyitian/spark:spark2.4.1
Submit a query
# bin/spark-shell --master k8s://https://kubernetes.default.svc.cluster.local:443 --name spark-shell --conf spark.executor.instances=1 --conf spark.kubernetes.driver.pod.name=sparkthrift-pod --conf spark.driver.host=sparkthrift-headless.default.svc.cluster.local --conf spark.driver.port=1888 --conf spark.kubernetes.container.image=zhixingheyitian/spark:spark2.4.1
19/05/07 12:37:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/05/07 12:37:59 WARN Utils: Service 'sparkDriver' could not bind on port 1888. Attempting port 1889.
19/05/07 12:38:00 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Spark context Web UI available at http://sparkthrift-headless.default.svc.cluster.local:4041
Spark context available as 'sc' (master = k8s://https://kubernetes.default.svc.cluster.local:443, app id = spark-application-1557232681808).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.1
      /_/
Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_201)
Type in expressions to have them evaluated.
Type :help for more information.
scala> spark.sql("SELECT * FROM tpcds_1g_parquet.store_sales limit 10").show()
+---------------+---------------+----------+--------------+-----------+-----------+----------+-----------+-----------+----------------+-----------+-----------------+-------------+--------------+-------------------+------------------+---------------------+-----------------+----------+-------------+-----------+-------------------+-------------+
|ss_sold_date_sk|ss_sold_time_sk|ss_item_sk|ss_customer_sk|ss_cdemo_sk|ss_hdemo_sk|ss_addr_sk|ss_store_sk|ss_promo_sk|ss_ticket_number|ss_quantity|ss_wholesale_cost|ss_list_price|ss_sales_price|ss_ext_discount_amt|ss_ext_sales_price|ss_ext_wholesale_cost|ss_ext_list_price|ss_ext_tax|ss_coupon_amt|ss_net_paid|ss_net_paid_inc_tax|ss_net_profit|
+---------------+---------------+----------+--------------+-----------+-----------+----------+-----------+-----------+----------------+-----------+-----------------+-------------+--------------+-------------------+------------------+---------------------+-----------------+----------+-------------+-----------+-------------------+-------------+
| 2451897| 50904| 5695| 38271| 442082| 2693| 46104| 4| 26| 1| 91| 47.2| 65.13| 31.26| 0.0| 2844.66| 4295.2| 5926.83| 170.67| 0.0| 2844.66| 3015.33| -1450.54|
| 2451897| 50904| 1871| 38271| 442082| 2693| 46104| 4| 263| 1| 75| 83.79| 93.84| 3.75| 0.0| 281.25| 6284.25| 7038.0| 14.06| 0.0| 281.25| 295.31| -6003.0|
| 2451897| 50904| 3533| 38271| 442082| 2693| 46104| 4| 179| 1| 15| 90.79| 128.01| 24.32| 240.76| 364.8| 1361.85| 1920.15| 8.68| 240.76| 124.04| 132.72| -1237.81|
| 2451897| 50904| 14989| 38271| 442082| 2693| 46104| 4| 89| 1| 33| 28.98| 40.57| 23.53| 0.0| 776.49| 956.34| 1338.81| 7.76| 0.0| 776.49| 784.25| -179.85|
| 2451897| 50904| 7928| 38271| 442082| 2693| 46104| 4| 30| 1| 14| 87.61| 145.43| 84.34| 0.0| 1180.76| 1226.54| 2036.02| 11.8| 0.0| 1180.76| 1192.56| -45.78|
| 2451897| 50904| 15233| 38271| 442082| 2693| 46104| 4| 72| 1| 70| 96.43| 109.93| 47.26| 0.0| 3308.2| 6750.1| 7695.1| 165.41| 0.0| 3308.2| 3473.61| -3441.9|
| 2451897| null| 9497| 38271| null| null| null| null| null| 1| null| null| null| 21.32| null| null| 2389.86| 4492.62| null| null| 527.67| 538.22| -1862.19|
| null| 50904| 15193| 38271| 442082| null| 46104| null| null| 1| 24| null| 9.8| null| null| null| null| null| null| null| null| null| null|
| 2452129| 31625| 1219| 95293| 605087| 4481| 40275| 1| 53| 2| 40| 4.75| 8.02| 0.88| 0.0| 35.2| 190.0| 320.8| 1.4| 0.0| 35.2| 36.6| -154.8|
| 2452129| 31625| 16745| 95293| 605087| 4481| 40275| 1| 130| 2| 99| 70.57| 141.14| 77.62| 0.0| 7684.38| 6986.43| 13972.86| 537.9| 0.0| 7684.38| 8222.28| 697.95|
+---------------+---------------+----------+--------------+-----------+-----------+----------+-----------+-----------+----------------+-----------+-----------------+-------------+--------------+-------------------+------------------+---------------------+-----------------+----------+-------------+-----------+-------------------+-------------+