I am having an issue creating a new column in my Spark DataFrame. I'm attempting to create a new column using withColumn() as follows:
.withColumn('%_diff_from_avg',
((col('aggregate_sales') - col('avg_sales')) / col('avg_sales') * 100))
This results in some values calculated correctly, but most of the values in my resultant table are null. I don't understand why.
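To be explicit about the value I expect in each row, here is the same arithmetic as a plain-Python sketch. The function name `pct_diff_from_avg` and the sample numbers are made up for illustration and are not taken from my real data; returning `None` for a missing or zero average is only my stand-in for a null result.

```python
def pct_diff_from_avg(aggregate_sales, avg_sales):
    """Percent difference of aggregate_sales from avg_sales.

    Hypothetical helper: mirrors the column expression
    (aggregate_sales - avg_sales) / avg_sales * 100 in plain Python.
    """
    # Treat missing inputs or a zero average as "no result" (None),
    # standing in for a null cell in the DataFrame.
    if aggregate_sales is None or avg_sales is None or avg_sales == 0:
        return None
    return (aggregate_sales - avg_sales) / avg_sales * 100


# Illustrative (made-up) rows of (aggregate_sales, avg_sales).
sample_rows = [(120.0, 100.0), (75.0, 100.0), (100.0, 100.0)]
expected = [pct_diff_from_avg(agg, avg) for agg, avg in sample_rows]
print(expected)
```

With inputs like these I would expect a number in every row, so the nulls I see in Spark are not explained by the formula itself.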
Interestingly, when I drop the '* 100' from the calculation, every value is populated correctly, i.e. there are no nulls. For example:
.withColumn('%_diff_from_avg',
((col('aggregate_sales') - col('avg_sales')) / col('avg_sales')))
seems to work.
So it seems that the multiplication by 100 is causing the issue.
Can anyone explain why?