Error: It appears that you are attempting to reference SparkContext from a broadcast variable, action
Error
_pickle.PicklingError: Could not serialize object: Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation.
SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
Code
def decode_itemid(p):
    hbase_util = HappyBaseUtil(hbase_host_bc.value, hbase_port_bc.value)
    for row in p:
        userid = row[0]
        rowkey_user = 'map_user_rev_' + str(userid)
        user_json = hbase_util.get_row(self.hbase_table_for_decode_itemid, rowkey_user,
                                       columns=["info:message"])
        user_decode_id = json.loads(user_json.get("info:message"))
        # ........

rec_result1 = predictions.mapPartitions(decode_itemid)
Cause
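decode_itemid references self, so when Spark serializes the function to send it to the workers, it has to pickle the whole self object, including self.sc. self.spark and self.sc can only be used on the driver (master), not on the workers.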
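The mechanism can be reproduced without a cluster. The sketch below is only an illustration, not the original job: FakeSparkContext and Recommender are hypothetical stand-ins, and the table name is a placeholder. FakeSparkContext simply refuses to be pickled, roughly as SparkContext does. Plain pickle cannot serialize nested functions, so a bound method is used here; pickling it drags the whole object along, much like a closure that captures self in PySpark.

```python
import pickle


class FakeSparkContext:
    """Hypothetical stand-in for SparkContext: it refuses to be pickled."""

    def __reduce__(self):
        raise RuntimeError("SparkContext can only be used on the driver")


class Recommender:
    """Hypothetical driver-side class, shaped like the one in this post."""

    def __init__(self):
        self.sc = FakeSparkContext()
        self.hbase_table_for_decode_itemid = "some_table"  # placeholder name

    def decode_itemid(self, partition):
        # Reading one attribute of self still ties the function to all of self.
        table = self.hbase_table_for_decode_itemid
        return [(table, row[0]) for row in partition]


def try_pickle(obj):
    """Return None if obj pickles cleanly, else the error message."""
    try:
        pickle.dumps(obj)
        return None
    except Exception as exc:
        return str(exc)


job = Recommender()
# Pickling the bound method means pickling job, and with it job.sc.
error = try_pickle(job.decode_itemid)
print("function that uses self:", error)
# The table name on its own is a plain string and pickles without trouble.
print("plain string:", try_pickle(job.hbase_table_for_decode_itemid))
```

The function only needs the table name, but because it reaches it through self, the serializer has to take the context too.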
Offending code
user_json = hbase_util.get_row(self.hbase_table_for_decode_itemid, rowkey_user,
                               columns=["info:message"])
# The reference to self.hbase_table_for_decode_itemid is what triggers the error
Solution
Change the code as follows:
# Outside decode_itemid (on the driver), add the following
hbase_table_for_decode_itemid = self.hbase_table_for_decode_itemid
hbase_table_for_decode_itemid_bc = self.sc.broadcast(hbase_table_for_decode_itemid)
# Inside decode_itemid, use the broadcast value instead of self
user_json = hbase_util.get_row(hbase_table_for_decode_itemid_bc.value, rowkey_user,
                               columns=["info:message"])
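The same idea, keeping self out of the function that is shipped to the workers, can be checked locally. This is a minimal sketch under assumptions, not the original job: Recommender, build_task and the table name are hypothetical, a threading.Lock stands in for the unpicklable SparkContext, and a pickle round trip stands in for sending the task to a worker.

```python
import pickle
import threading
from functools import partial


def decode_partition(table_name, partition):
    """Module-level function: it sees only the values passed to it."""
    return [(table_name, "map_user_rev_" + str(row[0])) for row in partition]


class Recommender:
    """Hypothetical driver-side class holding something unpicklable."""

    def __init__(self):
        self.sc = threading.Lock()  # stand-in: locks cannot be pickled
        self.hbase_table_for_decode_itemid = "some_table"  # placeholder name

    def build_task(self):
        # Copy the value out of self on the driver first...
        table_name = self.hbase_table_for_decode_itemid
        # ...so the task carries only a string and never touches self.
        return partial(decode_partition, table_name)


task = Recommender().build_task()
# Round trip through pickle, as a rough stand-in for shipping to a worker.
restored = pickle.loads(pickle.dumps(task))
result = restored([(42,), (7,)])
print(result)
```

In the real job the task would be passed to predictions.mapPartitions. For a small value such as a table name, copying it into a local variable is usually enough; a broadcast variable, as used above, is aimed mainly at larger read-only data.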